Warning: Permanently added '44.200.168.26' (ED25519) to the list of known hosts.

You can reproduce this build on your computer by running:

  sudo dnf install copr-rpmbuild
  /usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/9808137-fedora-rawhide-x86_64 --chroot fedora-rawhide-x86_64

Version: 1.6
PID: 8675
Logging PID: 8677
Task:
{'allow_user_ssh': False,
 'appstream': False,
 'background': True,
 'build_id': 9808137,
 'buildroot_pkgs': [],
 'chroot': 'fedora-rawhide-x86_64',
 'enable_net': False,
 'fedora_review': False,
 'git_hash': 'c764b47d888ca7c9122e784f68f8b958d584340d',
 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/RH/composable_kernel',
 'isolation': 'default',
 'memory_reqs': 2048,
 'package_name': 'composable_kernel',
 'package_version': '7.1.0-2',
 'project_dirname': 'RH',
 'project_name': 'RH',
 'project_owner': '@rocm-packagers-sig',
 'repo_priority': None,
 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/@rocm-packagers-sig/RH/fedora-rawhide-x86_64/',
            'id': 'copr_base',
            'name': 'Copr repository',
            'priority': None}],
 'sandbox': '@rocm-packagers-sig/RH--https://src.fedoraproject.org/user/trix',
 'source_json': {},
 'source_type': None,
 'ssh_public_keys': None,
 'storage': 0,
 'submitter': 'https://src.fedoraproject.org/user/trix',
 'tags': [],
 'task_id': '9808137-fedora-rawhide-x86_64',
 'timeout': 180000,
 'uses_devel_repo': False,
 'with_opts': [],
 'without_opts': []}

Running: git clone https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/RH/composable_kernel /var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel --depth 500 --no-single-branch --recursive

cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/RH/composable_kernel', '/var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel', '--depth', '500', '--no-single-branch', '--recursive']
cwd: .
rc: 0
stdout:
stderr: Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel'...

Running: git checkout c764b47d888ca7c9122e784f68f8b958d584340d --

cmd: ['git', 'checkout', 'c764b47d888ca7c9122e784f68f8b958d584340d', '--']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel
rc: 0
stdout:
stderr: Note: switching to 'c764b47d888ca7c9122e784f68f8b958d584340d'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at c764b47 automatic import of composable_kernel

Running: dist-git-client sources

cmd: ['dist-git-client', 'sources']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel
rc: 0
stdout:
stderr: INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD
INFO: Reading stdout from command: git rev-parse HEAD
INFO: Reading sources specification file: sources
INFO: Downloading composable_kernel-7.1.0.tar.gz
INFO: Reading stdout from command: curl --help all
INFO: Calling: curl -H Pragma: -o composable_kernel-7.1.0.tar.gz --location --connect-timeout 60 --retry 3 --retry-delay 10 --remote-time --show-error --fail --retry-all-errors https://copr-dist-git.fedorainfracloud.org/repo/pkgs/@rocm-packagers-sig/RH/composable_kernel/composable_kernel-7.1.0.tar.gz/md5/1d8f397b684a9582489474a9e94ce7bd/composable_kernel-7.1.0.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 5238k  100 5238k    0     0   205M      0 --:--:-- --:--:-- --:--:--  213M
INFO: Reading stdout from command: md5sum composable_kernel-7.1.0.tar.gz
tail: /var/lib/copr-rpmbuild/main.log: file truncated
Running (timeout=180000): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel/composable_kernel.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1763473467.053604 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
INFO: mock.py version 6.5 starting (python version = 3.13.7, NVR = mock-6.5-1.fc42), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel/composable_kernel.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1763473467.053604 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
Start(bootstrap): init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish(bootstrap): init plugins
Start: init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish: init plugins
INFO: Signal handler active
Start: run
INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel/composable_kernel.spec) Config(fedora-rawhide-x86_64)
Start: clean chroot
Finish: clean chroot
Mock Version: 6.5
INFO: Mock Version: 6.5
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-rawhide-x86_64-bootstrap-1763473467.053604/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
INFO: Guessed host environment type: unknown
INFO: Using container image: registry.fedoraproject.org/fedora:rawhide
INFO: Pulling image: registry.fedoraproject.org/fedora:rawhide
INFO: Tagging container image as mock-bootstrap-096fdc71-b0f5-44f4-b116-600bae454e94
INFO: Checking that 3116db95e9176f50c5f02b2a3cd30d246ccc224b26a7965a6e19ea9cff00aad4 image matches host's architecture
INFO: Copy content of container 3116db95e9176f50c5f02b2a3cd30d246ccc224b26a7965a6e19ea9cff00aad4 to /var/lib/mock/fedora-rawhide-x86_64-bootstrap-1763473467.053604/root
INFO: mounting 3116db95e9176f50c5f02b2a3cd30d246ccc224b26a7965a6e19ea9cff00aad4 with podman image mount
INFO: image 3116db95e9176f50c5f02b2a3cd30d246ccc224b26a7965a6e19ea9cff00aad4 as /var/lib/containers/storage/overlay/f9a7f5aadbbdd0a59122a9dc300c9d3a6477fc48665db67534f3df7592cdcf3a/merged
INFO: umounting image 3116db95e9176f50c5f02b2a3cd30d246ccc224b26a7965a6e19ea9cff00aad4 (/var/lib/containers/storage/overlay/f9a7f5aadbbdd0a59122a9dc300c9d3a6477fc48665db67534f3df7592cdcf3a/merged) with podman image umount
INFO: Removing image mock-bootstrap-096fdc71-b0f5-44f4-b116-600bae454e94
INFO: Package manager dnf5 detected and used (fallback)
INFO: Not updating bootstrap chroot, bootstrap_image_ready=True
Start(bootstrap): creating root cache
Finish(bootstrap): creating root cache
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-rawhide-x86_64-1763473467.053604/root.
INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Package manager dnf5 detected and used (direct choice) INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-6.0.0-1.fc44.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 dnf5-5.3.0.0-2.fc44.x86_64 dnf5-plugins-5.3.0.0-2.fc44.x86_64 Start: installing minimal buildroot with dnf5 Updating and loading repositories: Copr repository 100% | 1.2 MiB/s | 227.0 KiB | 00m00s fedora 100% | 16.7 MiB/s | 21.7 MiB | 00m01s Repositories loaded. Package Arch Version Repository Size Installing group/module packages: bash x86_64 0:5.3.0-2.fc43 fedora 8.4 MiB bzip2 x86_64 0:1.0.8-21.fc43 fedora 95.3 KiB coreutils x86_64 0:9.8-3.fc44 fedora 5.4 MiB cpio x86_64 0:2.15-6.fc43 fedora 1.1 MiB diffutils x86_64 0:3.12-3.fc43 fedora 1.6 MiB fedora-release-common noarch 0:44-0.5 fedora 20.6 KiB findutils x86_64 1:4.10.0-6.fc43 fedora 1.8 MiB gawk x86_64 0:5.3.2-2.fc43 fedora 1.8 MiB glibc-minimal-langpack x86_64 0:2.42.9000-11.fc44 fedora 0.0 B grep x86_64 0:3.12-2.fc43 fedora 1.0 MiB gzip x86_64 0:1.14-1.fc44 fedora 397.8 KiB info x86_64 0:7.2-6.fc43 fedora 353.9 KiB patch x86_64 0:2.8-2.fc43 fedora 222.8 KiB redhat-rpm-config noarch 0:343-14.fc44 fedora 183.3 KiB rpm-build x86_64 0:6.0.0-1.fc44 fedora 287.4 KiB sed x86_64 0:4.9-6.fc44 fedora 857.3 KiB shadow-utils x86_64 2:4.18.0-3.fc43 fedora 3.9 MiB tar x86_64 2:1.35-6.fc43 fedora 2.9 MiB unzip x86_64 0:6.0-68.fc44 fedora 390.3 KiB util-linux x86_64 0:2.41.2-7.fc44 fedora 3.5 MiB which x86_64 0:2.23-3.fc43 fedora 83.5 KiB xz x86_64 1:5.8.1-2.fc43 fedora 1.3 MiB Installing dependencies: add-determinism x86_64 0:0.7.2-2.fc44 fedora 2.3 MiB alternatives x86_64 0:1.33-3.fc44 fedora 62.2 KiB ansible-srpm-macros noarch 0:1-18.1.fc43 fedora 35.7 KiB audit-libs x86_64 0:4.1.2-2.fc44 fedora 378.8 KiB binutils x86_64 
0:2.45.50-9.fc44 copr_base 27.0 MiB build-reproducibility-srpm-macros noarch 0:0.7.2-2.fc44 fedora 1.2 KiB bzip2-libs x86_64 0:1.0.8-21.fc43 fedora 80.6 KiB ca-certificates noarch 0:2025.2.80_v9.0.304-2.fc44 fedora 2.7 MiB coreutils-common x86_64 0:9.8-3.fc44 fedora 11.1 MiB crypto-policies noarch 0:20250714-5.gitcd6043a.fc44 fedora 146.9 KiB curl x86_64 0:8.17.0-3.fc44 fedora 462.2 KiB cyrus-sasl-lib x86_64 0:2.1.28-33.fc44 fedora 2.3 MiB debugedit x86_64 0:5.2-3.fc44 fedora 214.0 KiB dwz x86_64 0:0.16-2.fc43 fedora 287.1 KiB ed x86_64 0:1.22.2-1.fc44 fedora 148.1 KiB efi-srpm-macros noarch 0:6-5.fc44 fedora 40.2 KiB elfutils x86_64 0:0.194-1.fc44 fedora 2.9 MiB elfutils-debuginfod-client x86_64 0:0.194-1.fc44 fedora 84.0 KiB elfutils-default-yama-scope noarch 0:0.194-1.fc44 fedora 1.8 KiB elfutils-libelf x86_64 0:0.194-1.fc44 fedora 1.1 MiB elfutils-libs x86_64 0:0.194-1.fc44 fedora 687.5 KiB fedora-gpg-keys noarch 0:44-0.1 fedora 131.2 KiB fedora-release noarch 0:44-0.5 fedora 0.0 B fedora-release-identity-basic noarch 0:44-0.5 fedora 664.0 B fedora-repos noarch 0:44-0.1 fedora 4.9 KiB fedora-repos-rawhide noarch 0:44-0.1 fedora 2.2 KiB file x86_64 0:5.46-8.fc44 fedora 100.2 KiB file-libs x86_64 0:5.46-8.fc44 fedora 11.9 MiB filesystem x86_64 0:3.18-50.fc43 fedora 112.0 B filesystem-srpm-macros noarch 0:3.18-50.fc43 fedora 38.2 KiB fonts-srpm-macros noarch 1:5.0.0-1.fc44 fedora 55.8 KiB forge-srpm-macros noarch 0:0.4.0-3.fc43 fedora 38.9 KiB fpc-srpm-macros noarch 0:1.3-15.fc43 fedora 144.0 B gap-srpm-macros noarch 0:2-1.fc44 fedora 2.1 KiB gdb-minimal x86_64 0:16.3-6.fc44 fedora 13.3 MiB gdbm-libs x86_64 1:1.23-10.fc43 fedora 129.9 KiB ghc-srpm-macros noarch 0:1.9.2-3.fc43 fedora 779.0 B glibc x86_64 0:2.42.9000-11.fc44 fedora 6.8 MiB glibc-common x86_64 0:2.42.9000-11.fc44 fedora 1.0 MiB glibc-gconv-extra x86_64 0:2.42.9000-11.fc44 fedora 7.2 MiB gmp x86_64 1:6.3.0-4.fc44 fedora 815.3 KiB gnat-srpm-macros noarch 0:6-8.fc43 fedora 1.0 KiB gnulib-l10n noarch 
0:20241231-1.fc44 fedora 655.0 KiB gnupg2 x86_64 0:2.4.8-4.fc43 fedora 6.5 MiB gnupg2-dirmngr x86_64 0:2.4.8-4.fc43 fedora 618.4 KiB gnupg2-gpg-agent x86_64 0:2.4.8-4.fc43 fedora 671.4 KiB gnupg2-gpgconf x86_64 0:2.4.8-4.fc43 fedora 250.0 KiB gnupg2-keyboxd x86_64 0:2.4.8-4.fc43 fedora 201.4 KiB gnupg2-verify x86_64 0:2.4.8-4.fc43 fedora 348.5 KiB gnutls x86_64 0:3.8.10-5.fc44 fedora 3.8 MiB go-srpm-macros noarch 0:3.8.0-1.fc44 fedora 61.9 KiB gpgverify noarch 0:2.2-3.fc43 fedora 8.7 KiB ima-evm-utils-libs x86_64 0:1.6.2-7.fc44 fedora 60.7 KiB jansson x86_64 0:2.14-3.fc43 fedora 89.1 KiB java-srpm-macros noarch 0:1-7.fc43 fedora 870.0 B json-c x86_64 0:0.18-7.fc43 fedora 82.7 KiB kernel-srpm-macros noarch 0:1.0-27.fc43 fedora 1.9 KiB keyutils-libs x86_64 0:1.6.3-6.fc43 fedora 54.3 KiB krb5-libs x86_64 0:1.21.3-10.fc44 fedora 2.3 MiB libacl x86_64 0:2.3.2-4.fc43 fedora 35.9 KiB libarchive x86_64 0:3.8.2-1.fc44 fedora 955.2 KiB libassuan x86_64 0:2.5.7-4.fc43 fedora 163.8 KiB libattr x86_64 0:2.5.2-6.fc43 fedora 24.4 KiB libblkid x86_64 0:2.41.2-7.fc44 fedora 262.4 KiB libbrotli x86_64 0:1.1.0-10.fc44 fedora 833.3 KiB libcap x86_64 0:2.77-1.fc44 fedora 209.1 KiB libcap-ng x86_64 0:0.8.5-8.fc44 fedora 68.9 KiB libcom_err x86_64 0:1.47.3-3.fc44 fedora 63.1 KiB libcurl x86_64 0:8.17.0-3.fc44 fedora 927.3 KiB libeconf x86_64 0:0.7.9-2.fc43 fedora 64.9 KiB libevent x86_64 0:2.1.12-16.fc43 fedora 883.1 KiB libfdisk x86_64 0:2.41.2-7.fc44 fedora 380.4 KiB libffi x86_64 0:3.5.2-1.fc44 fedora 83.8 KiB libfsverity x86_64 0:1.6-3.fc43 fedora 28.5 KiB libgcc x86_64 0:15.2.1-4.fc44 copr_base 266.6 KiB libgcrypt x86_64 0:1.11.2-1.fc44 fedora 1.6 MiB libgomp x86_64 0:15.2.1-4.fc44 copr_base 541.6 KiB libgpg-error x86_64 0:1.56-1.fc44 fedora 916.6 KiB libidn2 x86_64 0:2.3.8-2.fc43 fedora 552.5 KiB libksba x86_64 0:1.6.7-4.fc43 fedora 398.5 KiB liblastlog2 x86_64 0:2.41.2-7.fc44 fedora 33.9 KiB libmount x86_64 0:2.41.2-7.fc44 fedora 372.7 KiB libnghttp2 x86_64 0:1.68.0-1.fc44 fedora 
162.2 KiB libpkgconf x86_64 0:2.3.0-3.fc43 fedora 78.1 KiB libpsl x86_64 0:0.21.5-6.fc43 fedora 76.4 KiB libselinux x86_64 0:3.9-5.fc44 fedora 193.1 KiB libselinux-utils x86_64 0:3.9-5.fc44 fedora 309.0 KiB libsemanage x86_64 0:3.9-4.fc44 fedora 308.5 KiB libsepol x86_64 0:3.9-2.fc43 fedora 822.0 KiB libsmartcols x86_64 0:2.41.2-7.fc44 fedora 180.5 KiB libssh x86_64 0:0.11.3-1.fc44 fedora 567.1 KiB libssh-config noarch 0:0.11.3-1.fc44 fedora 277.0 B libstdc++ x86_64 0:15.2.1-4.fc44 copr_base 2.8 MiB libtasn1 x86_64 0:4.20.0-2.fc43 fedora 176.3 KiB libtool-ltdl x86_64 0:2.5.4-7.fc43 fedora 70.1 KiB libunistring x86_64 0:1.1-10.fc43 fedora 1.7 MiB libusb1 x86_64 0:1.0.29-4.fc44 fedora 171.3 KiB libuuid x86_64 0:2.41.2-7.fc44 fedora 37.3 KiB libverto x86_64 0:0.3.2-11.fc43 fedora 25.4 KiB libxcrypt x86_64 0:4.5.2-1.fc44 fedora 285.3 KiB libxml2 x86_64 0:2.12.10-5.fc44 fedora 1.7 MiB libzstd x86_64 0:1.5.7-3.fc44 fedora 940.3 KiB linkdupes x86_64 0:0.7.2-2.fc44 fedora 838.7 KiB lua-libs x86_64 0:5.4.8-3.fc44 fedora 280.8 KiB lua-srpm-macros noarch 0:1-16.fc43 fedora 1.3 KiB lz4-libs x86_64 0:1.10.0-3.fc43 fedora 161.4 KiB mpfr x86_64 0:4.2.2-2.fc43 fedora 832.8 KiB ncurses-base noarch 0:6.5-7.20250614.fc43 fedora 328.1 KiB ncurses-libs x86_64 0:6.5-7.20250614.fc43 fedora 946.3 KiB nettle x86_64 0:3.10.1-2.fc43 fedora 790.6 KiB npth x86_64 0:1.8-3.fc43 fedora 49.6 KiB ocaml-srpm-macros noarch 0:11-2.fc43 fedora 1.9 KiB openblas-srpm-macros noarch 0:2-20.fc43 fedora 112.0 B openldap x86_64 0:2.6.10-4.fc44 fedora 659.8 KiB openssl-libs x86_64 1:3.5.4-1.fc44 fedora 8.9 MiB p11-kit x86_64 0:0.25.8-1.fc44 fedora 2.3 MiB p11-kit-trust x86_64 0:0.25.8-1.fc44 fedora 446.5 KiB package-notes-srpm-macros noarch 0:0.5-14.fc43 fedora 1.6 KiB pam-libs x86_64 0:1.7.1-3.fc43 fedora 126.8 KiB pcre2 x86_64 0:10.47-1.fc44 fedora 702.6 KiB pcre2-syntax noarch 0:10.47-1.fc44 fedora 281.9 KiB perl-srpm-macros noarch 0:1-60.fc43 fedora 861.0 B pkgconf x86_64 0:2.3.0-3.fc43 fedora 88.5 KiB 
pkgconf-m4 noarch 0:2.3.0-3.fc43 fedora 14.4 KiB pkgconf-pkg-config x86_64 0:2.3.0-3.fc43 fedora 989.0 B policycoreutils x86_64 0:3.9-5.fc44 fedora 683.5 KiB popt x86_64 0:1.19-9.fc43 fedora 132.8 KiB publicsuffix-list-dafsa noarch 0:20250616-2.fc43 fedora 69.1 KiB pyproject-srpm-macros noarch 0:1.18.5-1.fc44 fedora 1.9 KiB python-srpm-macros noarch 0:3.14-9.fc44 fedora 51.6 KiB qt5-srpm-macros noarch 0:5.15.18-1.fc44 fedora 500.0 B qt6-srpm-macros noarch 0:6.10.0-1.fc44 fedora 464.0 B readline x86_64 0:8.3-2.fc43 fedora 511.7 KiB rpm x86_64 0:6.0.0-1.fc44 fedora 3.1 MiB rpm-build-libs x86_64 0:6.0.0-1.fc44 fedora 268.4 KiB rpm-libs x86_64 0:6.0.0-1.fc44 fedora 933.8 KiB rpm-plugin-selinux x86_64 0:6.0.0-1.fc44 fedora 12.0 KiB rpm-sequoia x86_64 0:1.9.0-2.fc43 fedora 2.5 MiB rpm-sign-libs x86_64 0:6.0.0-1.fc44 fedora 39.7 KiB rust-srpm-macros noarch 0:26.4-1.fc44 fedora 4.8 KiB selinux-policy noarch 0:42.15-1.fc44 fedora 32.0 KiB selinux-policy-targeted noarch 0:42.15-1.fc44 fedora 18.7 MiB setup noarch 0:2.15.0-27.fc44 fedora 724.9 KiB sqlite-libs x86_64 0:3.51.0-1.fc44 fedora 1.5 MiB systemd-libs x86_64 0:258.2-1.fc44 fedora 2.3 MiB systemd-standalone-sysusers x86_64 0:258.2-1.fc44 fedora 293.5 KiB tpm2-tss x86_64 0:4.1.3-8.fc43 fedora 1.6 MiB tree-sitter-srpm-macros noarch 0:0.4.2-1.fc43 fedora 8.3 KiB util-linux-core x86_64 0:2.41.2-7.fc44 fedora 1.5 MiB xxhash-libs x86_64 0:0.8.3-3.fc43 fedora 90.2 KiB xz-libs x86_64 1:5.8.1-2.fc43 fedora 217.8 KiB zig-srpm-macros noarch 0:1-5.fc43 fedora 1.1 KiB zip x86_64 0:3.0-44.fc43 fedora 694.5 KiB zlib-ng-compat x86_64 0:2.2.5-2.fc44 fedora 137.6 KiB zstd x86_64 0:1.5.7-3.fc44 fedora 506.2 KiB Installing groups: Buildsystem building group Transaction Summary: Installing: 177 packages Total size of inbound packages is 67 MiB. Need to download 67 MiB. After this operation, 219 MiB extra will be used (install 219 MiB, remove 0 B). 
[ 1/177] bzip2-0:1.0.8-21.fc43.x86_64 100% | 243.5 KiB/s | 51.6 KiB | 00m00s [ 2/177] cpio-0:2.15-6.fc43.x86_64 100% | 1.8 MiB/s | 293.1 KiB | 00m00s [ 3/177] coreutils-0:9.8-3.fc44.x86_64 100% | 2.7 MiB/s | 1.1 MiB | 00m00s [ 4/177] fedora-release-common-0:44-0. 100% | 650.1 KiB/s | 24.7 KiB | 00m00s [ 5/177] diffutils-0:3.12-3.fc43.x86_6 100% | 3.7 MiB/s | 392.3 KiB | 00m00s [ 6/177] bash-0:5.3.0-2.fc43.x86_64 100% | 3.8 MiB/s | 1.9 MiB | 00m00s [ 7/177] glibc-minimal-langpack-0:2.42 100% | 1.4 MiB/s | 60.9 KiB | 00m00s [ 8/177] findutils-1:4.10.0-6.fc43.x86 100% | 9.4 MiB/s | 550.0 KiB | 00m00s [ 9/177] grep-0:3.12-2.fc43.x86_64 100% | 6.3 MiB/s | 299.1 KiB | 00m00s [ 10/177] info-0:7.2-6.fc43.x86_64 100% | 4.2 MiB/s | 182.9 KiB | 00m00s [ 11/177] gzip-0:1.14-1.fc44.x86_64 100% | 3.2 MiB/s | 177.7 KiB | 00m00s [ 12/177] patch-0:2.8-2.fc43.x86_64 100% | 2.6 MiB/s | 113.8 KiB | 00m00s [ 13/177] redhat-rpm-config-0:343-14.fc 100% | 1.9 MiB/s | 79.2 KiB | 00m00s [ 14/177] rpm-build-0:6.0.0-1.fc44.x86_ 100% | 2.7 MiB/s | 138.0 KiB | 00m00s [ 15/177] sed-0:4.9-6.fc44.x86_64 100% | 6.6 MiB/s | 317.1 KiB | 00m00s [ 16/177] unzip-0:6.0-68.fc44.x86_64 100% | 4.2 MiB/s | 184.6 KiB | 00m00s [ 17/177] shadow-utils-2:4.18.0-3.fc43. 
100% | 15.1 MiB/s | 1.3 MiB | 00m00s [ 18/177] which-0:2.23-3.fc43.x86_64 100% | 1.0 MiB/s | 41.7 KiB | 00m00s [ 19/177] tar-2:1.35-6.fc43.x86_64 100% | 7.4 MiB/s | 856.4 KiB | 00m00s [ 20/177] xz-1:5.8.1-2.fc43.x86_64 100% | 12.7 MiB/s | 572.5 KiB | 00m00s [ 21/177] filesystem-0:3.18-50.fc43.x86 100% | 24.7 MiB/s | 1.3 MiB | 00m00s [ 22/177] gawk-0:5.3.2-2.fc43.x86_64 100% | 13.4 MiB/s | 1.1 MiB | 00m00s [ 23/177] util-linux-0:2.41.2-7.fc44.x8 100% | 12.4 MiB/s | 1.2 MiB | 00m00s [ 24/177] ncurses-libs-0:6.5-7.20250614 100% | 7.6 MiB/s | 332.7 KiB | 00m00s [ 25/177] bzip2-libs-0:1.0.8-21.fc43.x8 100% | 1.0 MiB/s | 43.1 KiB | 00m00s [ 26/177] glibc-0:2.42.9000-11.fc44.x86 100% | 28.8 MiB/s | 2.3 MiB | 00m00s [ 27/177] libacl-0:2.3.2-4.fc43.x86_64 100% | 638.9 KiB/s | 24.3 KiB | 00m00s [ 28/177] gmp-1:6.3.0-4.fc44.x86_64 100% | 6.8 MiB/s | 319.3 KiB | 00m00s [ 29/177] coreutils-common-0:9.8-3.fc44 100% | 24.4 MiB/s | 2.1 MiB | 00m00s [ 30/177] libattr-0:2.5.2-6.fc43.x86_64 100% | 469.8 KiB/s | 17.9 KiB | 00m00s [ 31/177] libcap-0:2.77-1.fc44.x86_64 100% | 2.0 MiB/s | 87.1 KiB | 00m00s [ 32/177] libselinux-0:3.9-5.fc44.x86_6 100% | 2.4 MiB/s | 97.8 KiB | 00m00s [ 33/177] openssl-libs-1:3.5.4-1.fc44.x 100% | 46.7 MiB/s | 2.6 MiB | 00m00s [ 34/177] fedora-repos-0:44-0.1.noarch 100% | 232.6 KiB/s | 9.1 KiB | 00m00s [ 35/177] systemd-libs-0:258.2-1.fc44.x 100% | 14.5 MiB/s | 818.4 KiB | 00m00s [ 36/177] glibc-common-0:2.42.9000-11.f 100% | 8.5 MiB/s | 348.0 KiB | 00m00s [ 37/177] pcre2-0:10.47-1.fc44.x86_64 100% | 6.4 MiB/s | 267.2 KiB | 00m00s [ 38/177] ed-0:1.22.2-1.fc44.x86_64 100% | 1.9 MiB/s | 83.7 KiB | 00m00s [ 39/177] ansible-srpm-macros-0:1-18.1. 
100% | 523.9 KiB/s | 19.9 KiB | 00m00s [ 40/177] build-reproducibility-srpm-ma 100% | 329.7 KiB/s | 12.9 KiB | 00m00s [ 41/177] dwz-0:0.16-2.fc43.x86_64 100% | 3.1 MiB/s | 135.5 KiB | 00m00s [ 42/177] efi-srpm-macros-0:6-5.fc44.no 100% | 592.8 KiB/s | 22.5 KiB | 00m00s [ 43/177] file-0:5.46-8.fc44.x86_64 100% | 1.2 MiB/s | 48.8 KiB | 00m00s [ 44/177] filesystem-srpm-macros-0:3.18 100% | 644.2 KiB/s | 26.4 KiB | 00m00s [ 45/177] fonts-srpm-macros-1:5.0.0-1.f 100% | 718.2 KiB/s | 27.3 KiB | 00m00s [ 46/177] forge-srpm-macros-0:0.4.0-3.f 100% | 515.1 KiB/s | 20.1 KiB | 00m00s [ 47/177] fpc-srpm-macros-0:1.3-15.fc43 100% | 192.5 KiB/s | 7.9 KiB | 00m00s [ 48/177] gap-srpm-macros-0:2-1.fc44.no 100% | 238.3 KiB/s | 9.1 KiB | 00m00s [ 49/177] ghc-srpm-macros-0:1.9.2-3.fc4 100% | 224.3 KiB/s | 8.7 KiB | 00m00s [ 50/177] gnat-srpm-macros-0:6-8.fc43.n 100% | 207.0 KiB/s | 8.5 KiB | 00m00s [ 51/177] go-srpm-macros-0:3.8.0-1.fc44 100% | 744.9 KiB/s | 28.3 KiB | 00m00s [ 52/177] java-srpm-macros-0:1-7.fc43.n 100% | 203.7 KiB/s | 7.9 KiB | 00m00s [ 53/177] kernel-srpm-macros-0:1.0-27.f 100% | 217.6 KiB/s | 8.9 KiB | 00m00s [ 54/177] lua-srpm-macros-0:1-16.fc43.n 100% | 230.4 KiB/s | 8.8 KiB | 00m00s [ 55/177] ocaml-srpm-macros-0:11-2.fc43 100% | 237.5 KiB/s | 9.3 KiB | 00m00s [ 56/177] openblas-srpm-macros-0:2-20.f 100% | 185.2 KiB/s | 7.6 KiB | 00m00s [ 57/177] package-notes-srpm-macros-0:0 100% | 236.5 KiB/s | 9.0 KiB | 00m00s [ 58/177] perl-srpm-macros-0:1-60.fc43. 100% | 212.5 KiB/s | 8.3 KiB | 00m00s [ 59/177] pyproject-srpm-macros-0:1.18. 
100% | 322.6 KiB/s | 13.2 KiB | 00m00s [ 60/177] python-srpm-macros-0:3.14-9.f 100% | 626.5 KiB/s | 23.8 KiB | 00m00s [ 61/177] qt5-srpm-macros-0:5.15.18-1.f 100% | 220.6 KiB/s | 8.6 KiB | 00m00s [ 62/177] qt6-srpm-macros-0:6.10.0-1.fc 100% | 228.3 KiB/s | 9.4 KiB | 00m00s [ 63/177] rpm-0:6.0.0-1.fc44.x86_64 100% | 13.7 MiB/s | 576.6 KiB | 00m00s [ 64/177] rust-srpm-macros-0:26.4-1.fc4 100% | 286.2 KiB/s | 11.2 KiB | 00m00s [ 65/177] zig-srpm-macros-0:1-5.fc43.no 100% | 228.0 KiB/s | 8.4 KiB | 00m00s [ 66/177] tree-sitter-srpm-macros-0:0.4 100% | 325.6 KiB/s | 13.4 KiB | 00m00s [ 67/177] zip-0:3.0-44.fc43.x86_64 100% | 6.2 MiB/s | 261.6 KiB | 00m00s [ 68/177] debugedit-0:5.2-3.fc44.x86_64 100% | 2.2 MiB/s | 85.6 KiB | 00m00s [ 69/177] elfutils-0:0.194-1.fc44.x86_6 100% | 11.0 MiB/s | 575.8 KiB | 00m00s [ 70/177] elfutils-libelf-0:0.194-1.fc4 100% | 5.0 MiB/s | 205.3 KiB | 00m00s [ 71/177] libarchive-0:3.8.2-1.fc44.x86 100% | 10.3 MiB/s | 422.2 KiB | 00m00s [ 72/177] popt-0:1.19-9.fc43.x86_64 100% | 1.5 MiB/s | 65.7 KiB | 00m00s [ 73/177] readline-0:8.3-2.fc43.x86_64 100% | 5.3 MiB/s | 224.6 KiB | 00m00s [ 74/177] rpm-build-libs-0:6.0.0-1.fc44 100% | 3.3 MiB/s | 127.9 KiB | 00m00s [ 75/177] zstd-0:1.5.7-3.fc44.x86_64 100% | 4.6 MiB/s | 189.5 KiB | 00m00s [ 76/177] rpm-libs-0:6.0.0-1.fc44.x86_6 100% | 8.1 MiB/s | 400.5 KiB | 00m00s [ 77/177] audit-libs-0:4.1.2-2.fc44.x86 100% | 3.6 MiB/s | 138.4 KiB | 00m00s [ 78/177] libeconf-0:0.7.9-2.fc43.x86_6 100% | 902.8 KiB/s | 35.2 KiB | 00m00s [ 79/177] libsemanage-0:3.9-4.fc44.x86_ 100% | 2.8 MiB/s | 123.5 KiB | 00m00s [ 80/177] libxcrypt-0:4.5.2-1.fc44.x86_ 100% | 3.3 MiB/s | 128.1 KiB | 00m00s [ 81/177] pam-libs-0:1.7.1-3.fc43.x86_6 100% | 1.4 MiB/s | 57.5 KiB | 00m00s [ 82/177] setup-0:2.15.0-27.fc44.noarch 100% | 3.5 MiB/s | 157.4 KiB | 00m00s [ 83/177] xz-libs-1:5.8.1-2.fc43.x86_64 100% | 2.9 MiB/s | 112.9 KiB | 00m00s [ 84/177] mpfr-0:4.2.2-2.fc43.x86_64 100% | 8.1 MiB/s | 347.0 KiB | 00m00s [ 85/177] 
libcap-ng-0:0.8.5-8.fc44.x86_ 100% | 870.2 KiB/s | 32.2 KiB | 00m00s [ 86/177] libblkid-0:2.41.2-7.fc44.x86_ 100% | 2.8 MiB/s | 123.2 KiB | 00m00s [ 87/177] libfdisk-0:2.41.2-7.fc44.x86_ 100% | 4.0 MiB/s | 162.0 KiB | 00m00s [ 88/177] liblastlog2-0:2.41.2-7.fc44.x 100% | 612.4 KiB/s | 23.3 KiB | 00m00s [ 89/177] libmount-0:2.41.2-7.fc44.x86_ 100% | 3.7 MiB/s | 162.7 KiB | 00m00s [ 90/177] libsmartcols-0:2.41.2-7.fc44. 100% | 2.1 MiB/s | 84.1 KiB | 00m00s [ 91/177] libuuid-0:2.41.2-7.fc44.x86_6 100% | 693.4 KiB/s | 26.3 KiB | 00m00s [ 92/177] util-linux-core-0:2.41.2-7.fc 100% | 10.5 MiB/s | 550.8 KiB | 00m00s [ 93/177] zlib-ng-compat-0:2.2.5-2.fc44 100% | 2.0 MiB/s | 79.2 KiB | 00m00s [ 94/177] glibc-gconv-extra-0:2.42.9000 100% | 32.8 MiB/s | 1.6 MiB | 00m00s [ 95/177] ncurses-base-0:6.5-7.20250614 100% | 2.1 MiB/s | 88.2 KiB | 00m00s [ 96/177] gnulib-l10n-0:20241231-1.fc44 100% | 3.7 MiB/s | 150.2 KiB | 00m00s [ 97/177] libsepol-0:3.9-2.fc43.x86_64 100% | 8.4 MiB/s | 345.4 KiB | 00m00s [ 98/177] crypto-policies-0:20250714-5. 100% | 2.4 MiB/s | 98.5 KiB | 00m00s [ 99/177] ca-certificates-0:2025.2.80_v 100% | 16.4 MiB/s | 973.8 KiB | 00m00s [100/177] fedora-gpg-keys-0:44-0.1.noar 100% | 3.6 MiB/s | 138.8 KiB | 00m00s [101/177] fedora-repos-rawhide-0:44-0.1 100% | 221.7 KiB/s | 8.6 KiB | 00m00s [102/177] pcre2-syntax-0:10.47-1.fc44.n 100% | 3.7 MiB/s | 164.7 KiB | 00m00s [103/177] add-determinism-0:0.7.2-2.fc4 100% | 20.2 MiB/s | 887.6 KiB | 00m00s [104/177] linkdupes-0:0.7.2-2.fc44.x86_ 100% | 8.3 MiB/s | 356.3 KiB | 00m00s [105/177] curl-0:8.17.0-3.fc44.x86_64 100% | 5.8 MiB/s | 232.4 KiB | 00m00s [106/177] file-libs-0:5.46-8.fc44.x86_6 100% | 15.1 MiB/s | 849.9 KiB | 00m00s [107/177] elfutils-libs-0:0.194-1.fc44. 
100% | 6.5 MiB/s | 271.6 KiB | 00m00s [108/177] elfutils-debuginfod-client-0: 100% | 1.2 MiB/s | 46.9 KiB | 00m00s [109/177] libzstd-0:1.5.7-3.fc44.x86_64 100% | 7.5 MiB/s | 359.1 KiB | 00m00s [110/177] libxml2-0:2.12.10-5.fc44.x86_ 100% | 15.4 MiB/s | 692.7 KiB | 00m00s [111/177] lz4-libs-0:1.10.0-3.fc43.x86_ 100% | 2.0 MiB/s | 78.0 KiB | 00m00s [112/177] lua-libs-0:5.4.8-3.fc44.x86_6 100% | 3.0 MiB/s | 131.9 KiB | 00m00s [113/177] rpm-sign-libs-0:6.0.0-1.fc44. 100% | 723.5 KiB/s | 28.2 KiB | 00m00s [114/177] rpm-sequoia-0:1.9.0-2.fc43.x8 100% | 20.7 MiB/s | 933.3 KiB | 00m00s [115/177] elfutils-default-yama-scope-0 100% | 317.3 KiB/s | 12.4 KiB | 00m00s [116/177] sqlite-libs-0:3.51.0-1.fc44.x 100% | 13.9 MiB/s | 766.5 KiB | 00m00s [117/177] json-c-0:0.18-7.fc43.x86_64 100% | 1.2 MiB/s | 45.0 KiB | 00m00s [118/177] ima-evm-utils-libs-0:1.6.2-7. 100% | 717.8 KiB/s | 29.4 KiB | 00m00s [119/177] libfsverity-0:1.6-3.fc43.x86_ 100% | 503.5 KiB/s | 18.6 KiB | 00m00s [120/177] gnupg2-0:2.4.8-4.fc43.x86_64 100% | 31.0 MiB/s | 1.6 MiB | 00m00s [121/177] gnupg2-dirmngr-0:2.4.8-4.fc43 100% | 6.9 MiB/s | 274.6 KiB | 00m00s [122/177] gpgverify-0:2.2-3.fc43.noarch 100% | 270.8 KiB/s | 11.1 KiB | 00m00s [123/177] gnupg2-gpg-agent-0:2.4.8-4.fc 100% | 6.5 MiB/s | 272.9 KiB | 00m00s [124/177] gnupg2-gpgconf-0:2.4.8-4.fc43 100% | 3.0 MiB/s | 115.0 KiB | 00m00s [125/177] gnupg2-keyboxd-0:2.4.8-4.fc43 100% | 2.2 MiB/s | 94.7 KiB | 00m00s [126/177] gnupg2-verify-0:2.4.8-4.fc43. 
100% | 4.2 MiB/s | 171.2 KiB | 00m00s [127/177] libassuan-0:2.5.7-4.fc43.x86_ 100% | 1.7 MiB/s | 67.4 KiB | 00m00s [128/177] libgpg-error-0:1.56-1.fc44.x8 100% | 5.9 MiB/s | 245.7 KiB | 00m00s [129/177] libgcrypt-0:1.11.2-1.fc44.x86 100% | 11.4 MiB/s | 596.1 KiB | 00m00s [130/177] npth-0:1.8-3.fc43.x86_64 100% | 675.2 KiB/s | 25.7 KiB | 00m00s [131/177] tpm2-tss-0:4.1.3-8.fc43.x86_6 100% | 9.9 MiB/s | 425.9 KiB | 00m00s [132/177] libksba-0:1.6.7-4.fc43.x86_64 100% | 4.1 MiB/s | 160.4 KiB | 00m00s [133/177] gnutls-0:3.8.10-5.fc44.x86_64 100% | 21.6 MiB/s | 1.4 MiB | 00m00s [134/177] openldap-0:2.6.10-4.fc44.x86_ 100% | 6.2 MiB/s | 259.5 KiB | 00m00s [135/177] libusb1-0:1.0.29-4.fc44.x86_6 100% | 2.1 MiB/s | 79.9 KiB | 00m00s [136/177] libidn2-0:2.3.8-2.fc43.x86_64 100% | 3.9 MiB/s | 174.9 KiB | 00m00s [137/177] libtasn1-0:4.20.0-2.fc43.x86_ 100% | 1.9 MiB/s | 74.5 KiB | 00m00s [138/177] libunistring-0:1.1-10.fc43.x8 100% | 12.9 MiB/s | 542.9 KiB | 00m00s [139/177] nettle-0:3.10.1-2.fc43.x86_64 100% | 8.8 MiB/s | 424.2 KiB | 00m00s [140/177] p11-kit-0:0.25.8-1.fc44.x86_6 100% | 11.6 MiB/s | 510.0 KiB | 00m00s [141/177] cyrus-sasl-lib-0:2.1.28-33.fc 100% | 18.1 MiB/s | 796.5 KiB | 00m00s [142/177] libevent-0:2.1.12-16.fc43.x86 100% | 5.6 MiB/s | 257.8 KiB | 00m00s [143/177] libtool-ltdl-0:2.5.4-7.fc43.x 100% | 929.1 KiB/s | 36.2 KiB | 00m00s [144/177] libffi-0:3.5.2-1.fc44.x86_64 100% | 1.1 MiB/s | 41.1 KiB | 00m00s [145/177] gdbm-libs-1:1.23-10.fc43.x86_ 100% | 1.3 MiB/s | 56.8 KiB | 00m00s [146/177] libgcc-0:15.2.1-4.fc44.x86_64 100% | 2.5 MiB/s | 134.6 KiB | 00m00s [147/177] libgomp-0:15.2.1-4.fc44.x86_6 100% | 6.8 MiB/s | 375.0 KiB | 00m00s [148/177] libstdc++-0:15.2.1-4.fc44.x86 100% | 8.3 MiB/s | 921.9 KiB | 00m00s [149/177] alternatives-0:1.33-3.fc44.x8 100% | 1.0 MiB/s | 40.8 KiB | 00m00s [150/177] jansson-0:2.14-3.fc43.x86_64 100% | 1.1 MiB/s | 45.3 KiB | 00m00s [151/177] pkgconf-pkg-config-0:2.3.0-3. 
100% | 246.4 KiB/s | 9.6 KiB | 00m00s [152/177] pkgconf-0:2.3.0-3.fc43.x86_64 100% | 1.0 MiB/s | 44.6 KiB | 00m00s [153/177] pkgconf-m4-0:2.3.0-3.fc43.noa 100% | 356.7 KiB/s | 13.9 KiB | 00m00s [154/177] libpkgconf-0:2.3.0-3.fc43.x86 100% | 924.3 KiB/s | 37.9 KiB | 00m00s [155/177] p11-kit-trust-0:0.25.8-1.fc44 100% | 3.4 MiB/s | 139.7 KiB | 00m00s [156/177] fedora-release-0:44-0.5.noarc 100% | 330.6 KiB/s | 13.6 KiB | 00m00s [157/177] systemd-standalone-sysusers-0 100% | 3.4 MiB/s | 141.1 KiB | 00m00s [158/177] binutils-0:2.45.50-9.fc44.x86 100% | 21.4 MiB/s | 5.9 MiB | 00m00s [159/177] xxhash-libs-0:0.8.3-3.fc43.x8 100% | 986.8 KiB/s | 38.5 KiB | 00m00s [160/177] fedora-release-identity-basic 100% | 387.1 KiB/s | 14.3 KiB | 00m00s [161/177] libcurl-0:8.17.0-3.fc44.x86_6 100% | 9.6 MiB/s | 412.4 KiB | 00m00s [162/177] gdb-minimal-0:16.3-6.fc44.x86 100% | 38.7 MiB/s | 4.4 MiB | 00m00s [163/177] krb5-libs-0:1.21.3-10.fc44.x8 100% | 16.9 MiB/s | 761.1 KiB | 00m00s [164/177] libbrotli-0:1.1.0-10.fc44.x86 100% | 7.7 MiB/s | 339.1 KiB | 00m00s [165/177] libpsl-0:0.21.5-6.fc43.x86_64 100% | 1.7 MiB/s | 65.0 KiB | 00m00s [166/177] libnghttp2-0:1.68.0-1.fc44.x8 100% | 1.7 MiB/s | 72.8 KiB | 00m00s [167/177] libssh-0:0.11.3-1.fc44.x86_64 100% | 5.5 MiB/s | 232.8 KiB | 00m00s [168/177] keyutils-libs-0:1.6.3-6.fc43. 100% | 825.0 KiB/s | 31.4 KiB | 00m00s [169/177] libcom_err-0:1.47.3-3.fc44.x8 100% | 656.8 KiB/s | 26.9 KiB | 00m00s [170/177] libverto-0:0.3.2-11.fc43.x86_ 100% | 530.1 KiB/s | 20.7 KiB | 00m00s [171/177] publicsuffix-list-dafsa-0:202 100% | 1.5 MiB/s | 59.2 KiB | 00m00s [172/177] libssh-config-0:0.11.3-1.fc44 100% | 222.2 KiB/s | 9.1 KiB | 00m00s [173/177] policycoreutils-0:3.9-5.fc44. 100% | 5.4 MiB/s | 214.6 KiB | 00m00s [174/177] selinux-policy-0:42.15-1.fc44 100% | 1.5 MiB/s | 63.4 KiB | 00m00s [175/177] libselinux-utils-0:3.9-5.fc44 100% | 3.1 MiB/s | 119.3 KiB | 00m00s [176/177] rpm-plugin-selinux-0:6.0.0-1. 
100% | 475.4 KiB/s | 19.5 KiB | 00m00s [177/177] selinux-policy-targeted-0:42. 100% | 54.8 MiB/s | 6.8 MiB | 00m00s -------------------------------------------------------------------------------- [177/177] Total 100% | 21.0 MiB/s | 66.6 MiB | 00m03s Running transaction Importing OpenPGP key 0x6D9F90A6: UserID : "Fedora (44) " Fingerprint: 36F612DCF27F7D1A48A835E4DBFCF71C6D9F90A6 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-44-primary The key was successfully imported. Importing OpenPGP key 0x6D9F90A6: UserID : "Fedora (44) " Fingerprint: 36F612DCF27F7D1A48A835E4DBFCF71C6D9F90A6 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-44-primary The key was successfully imported. Importing OpenPGP key 0x31645531: UserID : "Fedora (43) " Fingerprint: C6E7F081CF80E13146676E88829B606631645531 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-43-primary The key was successfully imported. Importing OpenPGP key 0xF577861E: UserID : "Fedora (45) " Fingerprint: 4F50A6114CD5C6976A7F1179655A4B02F577861E From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-45-primary The key was successfully imported. [ 1/179] Verify package files 100% | 756.0 B/s | 177.0 B | 00m00s >>> Running %pretrans scriptlet: filesystem-0:3.18-50.fc43.x86_64 >>> Finished %pretrans scriptlet: filesystem-0:3.18-50.fc43.x86_64 >>> [RPM] /var/lib/mock/fedora-rawhide-x86_64-1763473467.053604/root/var/cache/dnf/copr_base-6729cd57cc4ff6a8/packages/libgcc-15.2.1-4.fc44.x86_64.rpm: Header OpenPGP V4 RSA/SHA256 signature, key ID 4d0ea48a8d983303: NOKEY [ 2/179] Prepare transaction 100% | 3.8 KiB/s | 177.0 B | 00m00s [ 3/179] Installing libgcc-0:15.2.1-4. 100% | 262.0 MiB/s | 268.3 KiB | 00m00s [ 4/179] Installing libssh-config-0:0. 
100% | 0.0 B/s | 816.0 B | 00m00s [ 5/179] Installing publicsuffix-list- 100% | 0.0 B/s | 69.8 KiB | 00m00s [ 6/179] Installing fedora-release-ide 100% | 0.0 B/s | 920.0 B | 00m00s [ 7/179] Installing fedora-gpg-keys-0: 100% | 58.3 MiB/s | 179.0 KiB | 00m00s [ 8/179] Installing fedora-repos-rawhi 100% | 0.0 B/s | 2.4 KiB | 00m00s [ 9/179] Installing fedora-repos-0:44- 100% | 0.0 B/s | 5.7 KiB | 00m00s [ 10/179] Installing fedora-release-com 100% | 24.3 MiB/s | 24.9 KiB | 00m00s [ 11/179] Installing fedora-release-0:4 100% | 20.2 KiB/s | 124.0 B | 00m00s >>> Running sysusers scriptlet: setup-0:2.15.0-27.fc44.noarch >>> Finished sysusers scriptlet: setup-0:2.15.0-27.fc44.noarch >>> Scriptlet output: >>> Creating group 'adm' with GID 4. >>> Creating group 'audio' with GID 63. >>> Creating group 'cdrom' with GID 11. >>> Creating group 'clock' with GID 103. >>> Creating group 'dialout' with GID 18. >>> Creating group 'disk' with GID 6. >>> Creating group 'floppy' with GID 19. >>> Creating group 'ftp' with GID 50. >>> Creating group 'games' with GID 20. >>> Creating group 'input' with GID 104. >>> Creating group 'kmem' with GID 9. >>> Creating group 'kvm' with GID 36. >>> Creating group 'lock' with GID 54. >>> Creating group 'lp' with GID 7. >>> Creating group 'mail' with GID 12. >>> Creating group 'man' with GID 15. >>> Creating group 'mem' with GID 8. >>> Creating group 'nobody' with GID 65534. >>> Creating group 'render' with GID 105. >>> Creating group 'root' with GID 0. >>> Creating group 'sgx' with GID 106. >>> Creating group 'sys' with GID 3. >>> Creating group 'tape' with GID 33. >>> Creating group 'tty' with GID 5. >>> Creating group 'users' with GID 100. >>> Creating group 'utmp' with GID 22. >>> Creating group 'video' with GID 39. >>> Creating group 'wheel' with GID 10. >>> Creating user 'adm' (adm) with UID 3 and GID 4. >>> Creating group 'bin' with GID 1. >>> Creating user 'bin' (bin) with UID 1 and GID 1. >>> Creating group 'daemon' with GID 2. 
>>> Creating user 'daemon' (daemon) with UID 2 and GID 2. >>> Creating user 'ftp' (FTP User) with UID 14 and GID 50. >>> Creating user 'games' (games) with UID 12 and GID 100. >>> Creating user 'halt' (halt) with UID 7 and GID 0. >>> Creating user 'lp' (lp) with UID 4 and GID 7. >>> Creating user 'mail' (mail) with UID 8 and GID 12. >>> Creating user 'nobody' (Kernel Overflow User) with UID 65534 and GID 65534. >>> Creating user 'operator' (operator) with UID 11 and GID 0. >>> Creating user 'root' (Super User) with UID 0 and GID 0. >>> Creating user 'shutdown' (shutdown) with UID 6 and GID 0. >>> Creating user 'sync' (sync) with UID 5 and GID 0. >>> [ 12/179] Installing setup-0:2.15.0-27. 100% | 59.5 MiB/s | 730.6 KiB | 00m00s >>> [RPM] /etc/hosts created as /etc/hosts.rpmnew [ 13/179] Installing filesystem-0:3.18- 100% | 3.1 MiB/s | 212.8 KiB | 00m00s [ 14/179] Installing pkgconf-m4-0:2.3.0 100% | 0.0 B/s | 14.8 KiB | 00m00s [ 15/179] Installing pcre2-syntax-0:10. 100% | 277.7 MiB/s | 284.3 KiB | 00m00s [ 16/179] Installing gnulib-l10n-0:2024 100% | 215.5 MiB/s | 661.9 KiB | 00m00s [ 17/179] Installing coreutils-common-0 100% | 447.0 MiB/s | 11.2 MiB | 00m00s [ 18/179] Installing ncurses-base-0:6.5 100% | 115.1 MiB/s | 353.5 KiB | 00m00s [ 19/179] Installing bash-0:5.3.0-2.fc4 100% | 312.2 MiB/s | 8.4 MiB | 00m00s [ 20/179] Installing glibc-common-0:2.4 100% | 73.0 MiB/s | 1.0 MiB | 00m00s [ 21/179] Installing glibc-gconv-extra- 100% | 318.1 MiB/s | 7.3 MiB | 00m00s [ 22/179] Installing glibc-0:2.42.9000- 100% | 213.4 MiB/s | 6.8 MiB | 00m00s [ 23/179] Installing ncurses-libs-0:6.5 100% | 310.1 MiB/s | 952.8 KiB | 00m00s [ 24/179] Installing glibc-minimal-lang 100% | 0.0 B/s | 124.0 B | 00m00s [ 25/179] Installing zlib-ng-compat-0:2 100% | 0.0 B/s | 138.4 KiB | 00m00s [ 26/179] Installing bzip2-libs-0:1.0.8 100% | 0.0 B/s | 81.7 KiB | 00m00s [ 27/179] Installing libgpg-error-0:1.5 100% | 69.3 MiB/s | 922.5 KiB | 00m00s [ 28/179] Installing libstdc++-0:15.2.1 100% 
| 406.3 MiB/s | 2.8 MiB | 00m00s [ 29/179] Installing libassuan-0:2.5.7- 100% | 161.7 MiB/s | 165.6 KiB | 00m00s [ 30/179] Installing libgcrypt-0:1.11.2 100% | 394.0 MiB/s | 1.6 MiB | 00m00s [ 31/179] Installing readline-0:8.3-2.f 100% | 501.8 MiB/s | 513.9 KiB | 00m00s [ 32/179] Installing gmp-1:6.3.0-4.fc44 100% | 399.2 MiB/s | 817.5 KiB | 00m00s [ 33/179] Installing xz-libs-1:5.8.1-2. 100% | 213.8 MiB/s | 218.9 KiB | 00m00s [ 34/179] Installing libuuid-0:2.41.2-7 100% | 0.0 B/s | 38.5 KiB | 00m00s [ 35/179] Installing popt-0:1.19-9.fc43 100% | 68.1 MiB/s | 139.4 KiB | 00m00s [ 36/179] Installing libzstd-0:1.5.7-3. 100% | 459.7 MiB/s | 941.6 KiB | 00m00s [ 37/179] Installing elfutils-libelf-0: 100% | 560.5 MiB/s | 1.1 MiB | 00m00s [ 38/179] Installing npth-0:1.8-3.fc43. 100% | 0.0 B/s | 50.7 KiB | 00m00s [ 39/179] Installing libblkid-0:2.41.2- 100% | 257.4 MiB/s | 263.5 KiB | 00m00s [ 40/179] Installing libxcrypt-0:4.5.2- 100% | 281.3 MiB/s | 288.0 KiB | 00m00s [ 41/179] Installing libsepol-0:3.9-2.f 100% | 401.8 MiB/s | 822.9 KiB | 00m00s [ 42/179] Installing sqlite-libs-0:3.51 100% | 383.0 MiB/s | 1.5 MiB | 00m00s [ 43/179] Installing gnupg2-gpgconf-0:2 100% | 22.4 MiB/s | 252.0 KiB | 00m00s [ 44/179] Installing libattr-0:2.5.2-6. 100% | 0.0 B/s | 25.4 KiB | 00m00s [ 45/179] Installing libacl-0:2.3.2-4.f 100% | 0.0 B/s | 36.8 KiB | 00m00s [ 46/179] Installing pcre2-0:10.47-1.fc 100% | 343.8 MiB/s | 704.1 KiB | 00m00s [ 47/179] Installing libselinux-0:3.9-5 100% | 189.8 MiB/s | 194.4 KiB | 00m00s [ 48/179] Installing grep-0:3.12-2.fc43 100% | 71.6 MiB/s | 1.0 MiB | 00m00s [ 49/179] Installing sed-0:4.9-6.fc44.x 100% | 65.0 MiB/s | 865.5 KiB | 00m00s [ 50/179] Installing findutils-1:4.10.0 100% | 123.9 MiB/s | 1.9 MiB | 00m00s [ 51/179] Installing libtasn1-0:4.20.0- 100% | 173.9 MiB/s | 178.1 KiB | 00m00s [ 52/179] Installing libunistring-0:1.1 100% | 431.7 MiB/s | 1.7 MiB | 00m00s [ 53/179] Installing libidn2-0:2.3.8-2. 
100% | 68.2 MiB/s | 558.7 KiB | 00m00s [ 54/179] Installing crypto-policies-0: 100% | 42.0 MiB/s | 172.0 KiB | 00m00s [ 55/179] Installing xz-1:5.8.1-2.fc43. 100% | 83.2 MiB/s | 1.3 MiB | 00m00s [ 56/179] Installing libmount-0:2.41.2- 100% | 365.0 MiB/s | 373.8 KiB | 00m00s [ 57/179] Installing gnupg2-verify-0:2. 100% | 31.1 MiB/s | 349.9 KiB | 00m00s [ 58/179] Installing dwz-0:0.16-2.fc43. 100% | 25.6 MiB/s | 288.5 KiB | 00m00s [ 59/179] Installing mpfr-0:4.2.2-2.fc4 100% | 407.4 MiB/s | 834.4 KiB | 00m00s [ 60/179] Installing gawk-0:5.3.2-2.fc4 100% | 113.5 MiB/s | 1.8 MiB | 00m00s [ 61/179] Installing libksba-0:1.6.7-4. 100% | 391.7 MiB/s | 401.1 KiB | 00m00s [ 62/179] Installing unzip-0:6.0-68.fc4 100% | 35.0 MiB/s | 393.8 KiB | 00m00s [ 63/179] Installing file-libs-0:5.46-8 100% | 741.1 MiB/s | 11.9 MiB | 00m00s [ 64/179] Installing file-0:5.46-8.fc44 100% | 9.0 MiB/s | 101.7 KiB | 00m00s [ 65/179] Installing diffutils-0:3.12-3 100% | 111.5 MiB/s | 1.6 MiB | 00m00s [ 66/179] Installing libeconf-0:0.7.9-2 100% | 65.0 MiB/s | 66.5 KiB | 00m00s [ 67/179] Installing libcap-ng-0:0.8.5- 100% | 69.2 MiB/s | 70.8 KiB | 00m00s [ 68/179] Installing audit-libs-0:4.1.2 100% | 372.6 MiB/s | 381.5 KiB | 00m00s [ 69/179] Installing pam-libs-0:1.7.1-3 100% | 126.0 MiB/s | 129.0 KiB | 00m00s [ 70/179] Installing libcap-0:2.77-1.fc 100% | 19.0 MiB/s | 214.3 KiB | 00m00s [ 71/179] Installing systemd-libs-0:258 100% | 389.5 MiB/s | 2.3 MiB | 00m00s [ 72/179] Installing libsemanage-0:3.9- 100% | 303.0 MiB/s | 310.2 KiB | 00m00s [ 73/179] Installing libsmartcols-0:2.4 100% | 177.3 MiB/s | 181.6 KiB | 00m00s [ 74/179] Installing lua-libs-0:5.4.8-3 100% | 275.4 MiB/s | 282.0 KiB | 00m00s [ 75/179] Installing json-c-0:0.18-7.fc 100% | 0.0 B/s | 84.0 KiB | 00m00s [ 76/179] Installing libffi-0:3.5.2-1.f 100% | 83.2 MiB/s | 85.2 KiB | 00m00s [ 77/179] Installing p11-kit-0:0.25.8-1 100% | 134.7 MiB/s | 2.3 MiB | 00m00s [ 78/179] Installing alternatives-0:1.3 100% | 6.2 MiB/s | 63.8 KiB | 
00m00s [ 79/179] Installing p11-kit-trust-0:0. 100% | 24.3 MiB/s | 448.3 KiB | 00m00s [ 80/179] Installing openssl-libs-1:3.5 100% | 445.5 MiB/s | 8.9 MiB | 00m00s [ 81/179] Installing coreutils-0:9.8-3. 100% | 188.8 MiB/s | 5.5 MiB | 00m00s [ 82/179] Installing ca-certificates-0: 100% | 2.4 MiB/s | 2.5 MiB | 00m01s [ 83/179] Installing gzip-0:1.14-1.fc44 100% | 30.3 MiB/s | 403.3 KiB | 00m00s [ 84/179] Installing rpm-sequoia-0:1.9. 100% | 413.1 MiB/s | 2.5 MiB | 00m00s [ 85/179] Installing libfsverity-0:1.6- 100% | 0.0 B/s | 29.5 KiB | 00m00s [ 86/179] Installing libevent-0:2.1.12- 100% | 288.7 MiB/s | 886.8 KiB | 00m00s [ 87/179] Installing util-linux-core-0: 100% | 92.5 MiB/s | 1.5 MiB | 00m00s [ 88/179] Installing libusb1-0:1.0.29-4 100% | 24.1 MiB/s | 172.9 KiB | 00m00s >>> Running sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64 >>> Finished sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64 >>> Scriptlet output: >>> Creating group 'tss' with GID 59. >>> Creating user 'tss' (Account used for TPM access) with UID 59 and GID 59. >>> [ 89/179] Installing tpm2-tss-0:4.1.3-8 100% | 314.4 MiB/s | 1.6 MiB | 00m00s [ 90/179] Installing ima-evm-utils-libs 100% | 0.0 B/s | 62.0 KiB | 00m00s [ 91/179] Installing gnupg2-gpg-agent-0 100% | 36.6 MiB/s | 675.4 KiB | 00m00s [ 92/179] Installing systemd-standalone 100% | 26.1 MiB/s | 294.1 KiB | 00m00s [ 93/179] Installing rpm-libs-0:6.0.0-1 100% | 304.5 MiB/s | 935.3 KiB | 00m00s [ 94/179] Installing zip-0:3.0-44.fc43. 100% | 56.8 MiB/s | 698.4 KiB | 00m00s [ 95/179] Installing gnupg2-keyboxd-0:2 100% | 39.6 MiB/s | 202.7 KiB | 00m00s [ 96/179] Installing libpsl-0:0.21.5-6. 100% | 75.7 MiB/s | 77.5 KiB | 00m00s [ 97/179] Installing tar-2:1.35-6.fc43. 
100% | 164.3 MiB/s | 3.0 MiB | 00m00s [ 98/179] Installing linkdupes-0:0.7.2- 100% | 68.4 MiB/s | 840.1 KiB | 00m00s [ 99/179] Installing libselinux-utils-0 100% | 26.3 MiB/s | 323.4 KiB | 00m00s [100/179] Installing liblastlog2-0:2.41 100% | 7.0 MiB/s | 35.9 KiB | 00m00s [101/179] Installing libfdisk-0:2.41.2- 100% | 186.3 MiB/s | 381.6 KiB | 00m00s [102/179] Installing util-linux-0:2.41. 100% | 111.7 MiB/s | 3.6 MiB | 00m00s [103/179] Installing policycoreutils-0: 100% | 31.6 MiB/s | 711.8 KiB | 00m00s [104/179] Installing selinux-policy-0:4 100% | 1.8 MiB/s | 33.6 KiB | 00m00s [105/179] Installing selinux-policy-tar 100% | 219.2 MiB/s | 14.9 MiB | 00m00s [106/179] Installing zstd-0:1.5.7-3.fc4 100% | 35.6 MiB/s | 509.8 KiB | 00m00s [107/179] Installing libxml2-0:2.12.10- 100% | 113.6 MiB/s | 1.7 MiB | 00m00s [108/179] Installing nettle-0:3.10.1-2. 100% | 387.5 MiB/s | 793.7 KiB | 00m00s [109/179] Installing gnutls-0:3.8.10-5. 100% | 427.0 MiB/s | 3.8 MiB | 00m00s [110/179] Installing bzip2-0:1.0.8-21.f 100% | 8.9 MiB/s | 99.8 KiB | 00m00s [111/179] Installing add-determinism-0: 100% | 153.7 MiB/s | 2.3 MiB | 00m00s [112/179] Installing build-reproducibil 100% | 0.0 B/s | 1.5 KiB | 00m00s [113/179] Installing cpio-0:2.15-6.fc43 100% | 78.5 MiB/s | 1.1 MiB | 00m00s [114/179] Installing ed-0:1.22.2-1.fc44 100% | 13.3 MiB/s | 150.4 KiB | 00m00s [115/179] Installing patch-0:2.8-2.fc43 100% | 19.9 MiB/s | 224.3 KiB | 00m00s [116/179] Installing lz4-libs-0:1.10.0- 100% | 158.6 MiB/s | 162.5 KiB | 00m00s [117/179] Installing libarchive-0:3.8.2 100% | 311.6 MiB/s | 957.2 KiB | 00m00s [118/179] Installing libtool-ltdl-0:2.5 100% | 0.0 B/s | 71.2 KiB | 00m00s [119/179] Installing gdbm-libs-1:1.23-1 100% | 128.5 MiB/s | 131.6 KiB | 00m00s [120/179] Installing cyrus-sasl-lib-0:2 100% | 143.7 MiB/s | 2.3 MiB | 00m00s [121/179] Installing openldap-0:2.6.10- 100% | 324.0 MiB/s | 663.6 KiB | 00m00s [122/179] Installing gnupg2-dirmngr-0:2 100% | 33.7 MiB/s | 621.1 KiB | 00m00s 
[123/179] Installing gnupg2-0:2.4.8-4.f 100% | 252.0 MiB/s | 6.6 MiB | 00m00s [124/179] Installing rpm-sign-libs-0:6. 100% | 0.0 B/s | 40.6 KiB | 00m00s [125/179] Installing gpgverify-0:2.2-3. 100% | 0.0 B/s | 9.4 KiB | 00m00s [126/179] Installing libgomp-0:15.2.1-4 100% | 530.2 MiB/s | 543.0 KiB | 00m00s [127/179] Installing jansson-0:2.14-3.f 100% | 88.3 MiB/s | 90.5 KiB | 00m00s [128/179] Installing libpkgconf-0:2.3.0 100% | 77.4 MiB/s | 79.2 KiB | 00m00s [129/179] Installing pkgconf-0:2.3.0-3. 100% | 8.1 MiB/s | 91.0 KiB | 00m00s [130/179] Installing pkgconf-pkg-config 100% | 177.3 KiB/s | 1.8 KiB | 00m00s [131/179] Installing xxhash-libs-0:0.8. 100% | 89.4 MiB/s | 91.6 KiB | 00m00s [132/179] Installing libbrotli-0:1.1.0- 100% | 408.0 MiB/s | 835.6 KiB | 00m00s [133/179] Installing libnghttp2-0:1.68. 100% | 159.5 MiB/s | 163.4 KiB | 00m00s [134/179] Installing keyutils-libs-0:1. 100% | 0.0 B/s | 55.7 KiB | 00m00s [135/179] Installing libcom_err-0:1.47. 100% | 62.7 MiB/s | 64.2 KiB | 00m00s [136/179] Installing libverto-0:0.3.2-1 100% | 26.6 MiB/s | 27.2 KiB | 00m00s [137/179] Installing krb5-libs-0:1.21.3 100% | 328.5 MiB/s | 2.3 MiB | 00m00s [138/179] Installing libssh-0:0.11.3-1. 
100% | 277.9 MiB/s | 569.2 KiB | 00m00s [139/179] Installing libcurl-0:8.17.0-3 100% | 302.2 MiB/s | 928.4 KiB | 00m00s [140/179] Installing curl-0:8.17.0-3.fc 100% | 22.7 MiB/s | 464.7 KiB | 00m00s [141/179] Installing rpm-0:6.0.0-1.fc44 100% | 92.0 MiB/s | 2.6 MiB | 00m00s [142/179] Installing efi-srpm-macros-0: 100% | 0.0 B/s | 41.2 KiB | 00m00s [143/179] Installing java-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [144/179] Installing lua-srpm-macros-0: 100% | 0.0 B/s | 1.9 KiB | 00m00s [145/179] Installing tree-sitter-srpm-m 100% | 0.0 B/s | 9.3 KiB | 00m00s [146/179] Installing zig-srpm-macros-0: 100% | 0.0 B/s | 1.7 KiB | 00m00s [147/179] Installing filesystem-srpm-ma 100% | 0.0 B/s | 38.9 KiB | 00m00s [148/179] Installing elfutils-default-y 100% | 510.7 KiB/s | 2.0 KiB | 00m00s [149/179] Installing elfutils-libs-0:0. 100% | 336.6 MiB/s | 689.3 KiB | 00m00s [150/179] Installing elfutils-debuginfo 100% | 7.0 MiB/s | 86.3 KiB | 00m00s [151/179] Installing elfutils-0:0.194-1 100% | 172.4 MiB/s | 2.9 MiB | 00m00s [152/179] Installing binutils-0:2.45.50 100% | 366.0 MiB/s | 27.1 MiB | 00m00s [153/179] Installing gdb-minimal-0:16.3 100% | 315.6 MiB/s | 13.3 MiB | 00m00s [154/179] Installing debugedit-0:5.2-3. 
100% | 17.7 MiB/s | 217.3 KiB | 00m00s [155/179] Installing rpm-build-libs-0:6 100% | 262.9 MiB/s | 269.2 KiB | 00m00s [156/179] Installing rust-srpm-macros-0 100% | 0.0 B/s | 5.6 KiB | 00m00s [157/179] Installing qt6-srpm-macros-0: 100% | 0.0 B/s | 740.0 B | 00m00s [158/179] Installing qt5-srpm-macros-0: 100% | 0.0 B/s | 776.0 B | 00m00s [159/179] Installing perl-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [160/179] Installing package-notes-srpm 100% | 0.0 B/s | 2.0 KiB | 00m00s [161/179] Installing openblas-srpm-macr 100% | 0.0 B/s | 392.0 B | 00m00s [162/179] Installing ocaml-srpm-macros- 100% | 0.0 B/s | 2.1 KiB | 00m00s [163/179] Installing kernel-srpm-macros 100% | 0.0 B/s | 2.3 KiB | 00m00s [164/179] Installing gnat-srpm-macros-0 100% | 0.0 B/s | 1.3 KiB | 00m00s [165/179] Installing ghc-srpm-macros-0: 100% | 0.0 B/s | 1.0 KiB | 00m00s [166/179] Installing gap-srpm-macros-0: 100% | 0.0 B/s | 2.7 KiB | 00m00s [167/179] Installing fpc-srpm-macros-0: 100% | 0.0 B/s | 420.0 B | 00m00s [168/179] Installing ansible-srpm-macro 100% | 0.0 B/s | 36.2 KiB | 00m00s [169/179] Installing redhat-rpm-config- 100% | 185.1 MiB/s | 189.5 KiB | 00m00s [170/179] Installing forge-srpm-macros- 100% | 0.0 B/s | 40.3 KiB | 00m00s [171/179] Installing fonts-srpm-macros- 100% | 0.0 B/s | 57.0 KiB | 00m00s [172/179] Installing go-srpm-macros-0:3 100% | 0.0 B/s | 63.0 KiB | 00m00s [173/179] Installing rpm-build-0:6.0.0- 100% | 18.1 MiB/s | 296.5 KiB | 00m00s [174/179] Installing pyproject-srpm-mac 100% | 0.0 B/s | 2.5 KiB | 00m00s [175/179] Installing python-srpm-macros 100% | 0.0 B/s | 52.9 KiB | 00m00s [176/179] Installing rpm-plugin-selinux 100% | 0.0 B/s | 13.0 KiB | 00m00s [177/179] Installing which-0:2.23-3.fc4 100% | 7.0 MiB/s | 85.7 KiB | 00m00s [178/179] Installing shadow-utils-2:4.1 100% | 158.8 MiB/s | 4.0 MiB | 00m00s [179/179] Installing info-0:7.2-6.fc43. 
100% | 52.4 KiB/s | 354.3 KiB | 00m07s Warning: skipped OpenPGP checks for 4 packages from repository: copr_base Complete! Finish: installing minimal buildroot with dnf5 Start: creating root cache Finish: creating root cache Finish: chroot init INFO: Installed packages: INFO: add-determinism-0.7.2-2.fc44.x86_64 alternatives-1.33-3.fc44.x86_64 ansible-srpm-macros-1-18.1.fc43.noarch audit-libs-4.1.2-2.fc44.x86_64 bash-5.3.0-2.fc43.x86_64 binutils-2.45.50-9.fc44.x86_64 build-reproducibility-srpm-macros-0.7.2-2.fc44.noarch bzip2-1.0.8-21.fc43.x86_64 bzip2-libs-1.0.8-21.fc43.x86_64 ca-certificates-2025.2.80_v9.0.304-2.fc44.noarch coreutils-9.8-3.fc44.x86_64 coreutils-common-9.8-3.fc44.x86_64 cpio-2.15-6.fc43.x86_64 crypto-policies-20250714-5.gitcd6043a.fc44.noarch curl-8.17.0-3.fc44.x86_64 cyrus-sasl-lib-2.1.28-33.fc44.x86_64 debugedit-5.2-3.fc44.x86_64 diffutils-3.12-3.fc43.x86_64 dwz-0.16-2.fc43.x86_64 ed-1.22.2-1.fc44.x86_64 efi-srpm-macros-6-5.fc44.noarch elfutils-0.194-1.fc44.x86_64 elfutils-debuginfod-client-0.194-1.fc44.x86_64 elfutils-default-yama-scope-0.194-1.fc44.noarch elfutils-libelf-0.194-1.fc44.x86_64 elfutils-libs-0.194-1.fc44.x86_64 fedora-gpg-keys-44-0.1.noarch fedora-release-44-0.5.noarch fedora-release-common-44-0.5.noarch fedora-release-identity-basic-44-0.5.noarch fedora-repos-44-0.1.noarch fedora-repos-rawhide-44-0.1.noarch file-5.46-8.fc44.x86_64 file-libs-5.46-8.fc44.x86_64 filesystem-3.18-50.fc43.x86_64 filesystem-srpm-macros-3.18-50.fc43.noarch findutils-4.10.0-6.fc43.x86_64 fonts-srpm-macros-5.0.0-1.fc44.noarch forge-srpm-macros-0.4.0-3.fc43.noarch fpc-srpm-macros-1.3-15.fc43.noarch gap-srpm-macros-2-1.fc44.noarch gawk-5.3.2-2.fc43.x86_64 gdb-minimal-16.3-6.fc44.x86_64 gdbm-libs-1.23-10.fc43.x86_64 ghc-srpm-macros-1.9.2-3.fc43.noarch glibc-2.42.9000-11.fc44.x86_64 glibc-common-2.42.9000-11.fc44.x86_64 glibc-gconv-extra-2.42.9000-11.fc44.x86_64 glibc-minimal-langpack-2.42.9000-11.fc44.x86_64 gmp-6.3.0-4.fc44.x86_64 
gnat-srpm-macros-6-8.fc43.noarch gnulib-l10n-20241231-1.fc44.noarch gnupg2-2.4.8-4.fc43.x86_64 gnupg2-dirmngr-2.4.8-4.fc43.x86_64 gnupg2-gpg-agent-2.4.8-4.fc43.x86_64 gnupg2-gpgconf-2.4.8-4.fc43.x86_64 gnupg2-keyboxd-2.4.8-4.fc43.x86_64 gnupg2-verify-2.4.8-4.fc43.x86_64 gnutls-3.8.10-5.fc44.x86_64 go-srpm-macros-3.8.0-1.fc44.noarch gpg-pubkey-36f612dcf27f7d1a48a835e4dbfcf71c6d9f90a6-6786af3b gpg-pubkey-4f50a6114cd5c6976a7f1179655a4b02f577861e-6888bc98 gpg-pubkey-c6e7f081cf80e13146676e88829b606631645531-66b6dccf gpgverify-2.2-3.fc43.noarch grep-3.12-2.fc43.x86_64 gzip-1.14-1.fc44.x86_64 ima-evm-utils-libs-1.6.2-7.fc44.x86_64 info-7.2-6.fc43.x86_64 jansson-2.14-3.fc43.x86_64 java-srpm-macros-1-7.fc43.noarch json-c-0.18-7.fc43.x86_64 kernel-srpm-macros-1.0-27.fc43.noarch keyutils-libs-1.6.3-6.fc43.x86_64 krb5-libs-1.21.3-10.fc44.x86_64 libacl-2.3.2-4.fc43.x86_64 libarchive-3.8.2-1.fc44.x86_64 libassuan-2.5.7-4.fc43.x86_64 libattr-2.5.2-6.fc43.x86_64 libblkid-2.41.2-7.fc44.x86_64 libbrotli-1.1.0-10.fc44.x86_64 libcap-2.77-1.fc44.x86_64 libcap-ng-0.8.5-8.fc44.x86_64 libcom_err-1.47.3-3.fc44.x86_64 libcurl-8.17.0-3.fc44.x86_64 libeconf-0.7.9-2.fc43.x86_64 libevent-2.1.12-16.fc43.x86_64 libfdisk-2.41.2-7.fc44.x86_64 libffi-3.5.2-1.fc44.x86_64 libfsverity-1.6-3.fc43.x86_64 libgcc-15.2.1-4.fc44.x86_64 libgcrypt-1.11.2-1.fc44.x86_64 libgomp-15.2.1-4.fc44.x86_64 libgpg-error-1.56-1.fc44.x86_64 libidn2-2.3.8-2.fc43.x86_64 libksba-1.6.7-4.fc43.x86_64 liblastlog2-2.41.2-7.fc44.x86_64 libmount-2.41.2-7.fc44.x86_64 libnghttp2-1.68.0-1.fc44.x86_64 libpkgconf-2.3.0-3.fc43.x86_64 libpsl-0.21.5-6.fc43.x86_64 libselinux-3.9-5.fc44.x86_64 libselinux-utils-3.9-5.fc44.x86_64 libsemanage-3.9-4.fc44.x86_64 libsepol-3.9-2.fc43.x86_64 libsmartcols-2.41.2-7.fc44.x86_64 libssh-0.11.3-1.fc44.x86_64 libssh-config-0.11.3-1.fc44.noarch libstdc++-15.2.1-4.fc44.x86_64 libtasn1-4.20.0-2.fc43.x86_64 libtool-ltdl-2.5.4-7.fc43.x86_64 libunistring-1.1-10.fc43.x86_64 libusb1-1.0.29-4.fc44.x86_64 
libuuid-2.41.2-7.fc44.x86_64 libverto-0.3.2-11.fc43.x86_64 libxcrypt-4.5.2-1.fc44.x86_64 libxml2-2.12.10-5.fc44.x86_64 libzstd-1.5.7-3.fc44.x86_64 linkdupes-0.7.2-2.fc44.x86_64 lua-libs-5.4.8-3.fc44.x86_64 lua-srpm-macros-1-16.fc43.noarch lz4-libs-1.10.0-3.fc43.x86_64 mpfr-4.2.2-2.fc43.x86_64 ncurses-base-6.5-7.20250614.fc43.noarch ncurses-libs-6.5-7.20250614.fc43.x86_64 nettle-3.10.1-2.fc43.x86_64 npth-1.8-3.fc43.x86_64 ocaml-srpm-macros-11-2.fc43.noarch openblas-srpm-macros-2-20.fc43.noarch openldap-2.6.10-4.fc44.x86_64 openssl-libs-3.5.4-1.fc44.x86_64 p11-kit-0.25.8-1.fc44.x86_64 p11-kit-trust-0.25.8-1.fc44.x86_64 package-notes-srpm-macros-0.5-14.fc43.noarch pam-libs-1.7.1-3.fc43.x86_64 patch-2.8-2.fc43.x86_64 pcre2-10.47-1.fc44.x86_64 pcre2-syntax-10.47-1.fc44.noarch perl-srpm-macros-1-60.fc43.noarch pkgconf-2.3.0-3.fc43.x86_64 pkgconf-m4-2.3.0-3.fc43.noarch pkgconf-pkg-config-2.3.0-3.fc43.x86_64 policycoreutils-3.9-5.fc44.x86_64 popt-1.19-9.fc43.x86_64 publicsuffix-list-dafsa-20250616-2.fc43.noarch pyproject-srpm-macros-1.18.5-1.fc44.noarch python-srpm-macros-3.14-9.fc44.noarch qt5-srpm-macros-5.15.18-1.fc44.noarch qt6-srpm-macros-6.10.0-1.fc44.noarch readline-8.3-2.fc43.x86_64 redhat-rpm-config-343-14.fc44.noarch rpm-6.0.0-1.fc44.x86_64 rpm-build-6.0.0-1.fc44.x86_64 rpm-build-libs-6.0.0-1.fc44.x86_64 rpm-libs-6.0.0-1.fc44.x86_64 rpm-plugin-selinux-6.0.0-1.fc44.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 rpm-sign-libs-6.0.0-1.fc44.x86_64 rust-srpm-macros-26.4-1.fc44.noarch sed-4.9-6.fc44.x86_64 selinux-policy-42.15-1.fc44.noarch selinux-policy-targeted-42.15-1.fc44.noarch setup-2.15.0-27.fc44.noarch shadow-utils-4.18.0-3.fc43.x86_64 sqlite-libs-3.51.0-1.fc44.x86_64 systemd-libs-258.2-1.fc44.x86_64 systemd-standalone-sysusers-258.2-1.fc44.x86_64 tar-1.35-6.fc43.x86_64 tpm2-tss-4.1.3-8.fc43.x86_64 tree-sitter-srpm-macros-0.4.2-1.fc43.noarch unzip-6.0-68.fc44.x86_64 util-linux-2.41.2-7.fc44.x86_64 util-linux-core-2.41.2-7.fc44.x86_64 which-2.23-3.fc43.x86_64 
xxhash-libs-0.8.3-3.fc43.x86_64 xz-5.8.1-2.fc43.x86_64 xz-libs-5.8.1-2.fc43.x86_64 zig-srpm-macros-1-5.fc43.noarch zip-3.0-44.fc43.x86_64 zlib-ng-compat-2.2.5-2.fc44.x86_64 zstd-1.5.7-3.fc44.x86_64 Start: buildsrpm Start: rpmbuild -bs Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1763424000 Wrote: /builddir/build/SRPMS/composable_kernel-7.1.0-2.fc44.src.rpm Finish: rpmbuild -bs INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: /var/lib/mock/fedora-rawhide-x86_64-1763473467.053604/root/var/log/dnf5.log INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz /bin/tar: Removing leading `/' from member names Finish: buildsrpm INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-xidpuc_4/composable_kernel/composable_kernel.spec) Config(child) 0 minutes 28 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_success=True') Start: clean chroot INFO: unmounting tmpfs. Finish: clean chroot INFO: Start(/var/lib/copr-rpmbuild/results/composable_kernel-7.1.0-2.fc44.src.rpm) Config(fedora-rawhide-x86_64) Start(bootstrap): chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-rawhide-x86_64-bootstrap-1763473467.053604/root. INFO: reusing tmpfs at /var/lib/mock/fedora-rawhide-x86_64-bootstrap-1763473467.053604/root. INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start(bootstrap): cleaning package manager metadata Finish(bootstrap): cleaning package manager metadata Finish(bootstrap): chroot init Start: chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-rawhide-x86_64-1763473467.053604/root. 
INFO: calling preinit hooks INFO: enabled root cache Start: unpacking root cache Finish: unpacking root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-6.0.0-1.fc44.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 dnf5-5.3.0.0-2.fc44.x86_64 dnf5-plugins-5.3.0.0-2.fc44.x86_64 Finish: chroot init Start: build phase for composable_kernel-7.1.0-2.fc44.src.rpm Start: build setup for composable_kernel-7.1.0-2.fc44.src.rpm Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1763424000 Wrote: /builddir/build/SRPMS/composable_kernel-7.1.0-2.fc44.src.rpm Updating and loading repositories: Copr repository 100% | 76.6 KiB/s | 1.5 KiB | 00m00s fedora 100% | 351.0 KiB/s | 30.2 KiB | 00m00s Repositories loaded. Package Arch Version Repository Size Installing: cmake x86_64 0:3.31.6-4.fc43 fedora 34.5 MiB fdupes x86_64 1:2.4.0-2.fc43 fedora 118.1 KiB gcc-c++ x86_64 0:15.2.1-4.fc44 copr_base 41.4 MiB git x86_64 0:2.51.1-1.fc44 fedora 56.4 KiB ninja-build x86_64 0:1.13.1-4.fc44 fedora 480.7 KiB rocm-cmake noarch 0:7.1.0-1.fc44 copr_base 129.5 KiB rocm-comgr-devel x86_64 0:20-7.rocm7.1.0.fc44 copr_base 100.5 KiB rocm-compilersupport-macros noarch 0:20-7.rocm7.1.0.fc44 copr_base 160.0 B rocm-hip-devel x86_64 0:7.1.0-1.fc44 copr_base 2.4 MiB rocm-rpm-macros noarch 0:7.1.0-2.fc44 fedora 18.9 KiB rocm-runtime-devel x86_64 0:7.1.0-1.fc44 copr_base 683.4 KiB Installing dependencies: annobin-docs noarch 0:13.03-1.fc44 fedora 99.2 KiB annobin-plugin-gcc x86_64 0:13.03-1.fc44 fedora 695.8 KiB cmake-data noarch 0:3.31.6-4.fc43 fedora 8.5 MiB cmake-filesystem x86_64 0:3.31.6-4.fc43 fedora 0.0 B cmake-rpm-macros noarch 0:3.31.6-4.fc43 fedora 7.7 KiB cpp x86_64 0:15.2.1-4.fc44 copr_base 37.9 MiB emacs-filesystem noarch 1:30.0-5.fc43 fedora 0.0 B environment-modules x86_64 
0:5.6.0-1.fc43 fedora 1.9 MiB expat x86_64 0:2.7.2-1.fc44 fedora 298.6 KiB gcc x86_64 0:15.2.1-4.fc44 copr_base 111.9 MiB gcc-plugin-annobin x86_64 0:15.2.1-4.fc44 copr_base 57.1 KiB git-core x86_64 0:2.51.1-1.fc44 fedora 23.6 MiB git-core-doc noarch 0:2.51.1-1.fc44 fedora 17.7 MiB glibc-devel x86_64 0:2.42.9000-11.fc44 fedora 2.3 MiB groff-base x86_64 0:1.23.0-11.fc44 fedora 3.8 MiB hipcc x86_64 0:20-7.rocm7.1.0.fc44 copr_base 634.5 KiB hwdata noarch 0:0.401-1.fc44 fedora 9.6 MiB jsoncpp x86_64 0:1.9.6-2.fc43 fedora 257.6 KiB kernel-headers x86_64 0:6.18.0-0.rc6.51.fc44 fedora 6.8 MiB less x86_64 0:685-5.fc44 fedora 413.4 KiB libcbor x86_64 0:0.13.0-1.fc44 fedora 79.4 KiB libdrm x86_64 0:2.4.128-3.fc44 fedora 399.9 KiB libedit x86_64 0:3.1-57.20251016cvs.fc44 fedora 240.2 KiB libfido2 x86_64 0:1.16.0-4.fc44 fedora 238.5 KiB libmpc x86_64 0:1.3.1-8.fc43 fedora 160.6 KiB libpciaccess x86_64 0:0.16-16.fc43 fedora 44.5 KiB libpipeline x86_64 0:1.5.8-3.fc43 fedora 145.1 KiB libstdc++-devel x86_64 0:15.2.1-4.fc44 copr_base 37.2 MiB libtommath x86_64 0:1.3.1~rc1-6.fc43 fedora 126.4 KiB libuv x86_64 1:1.51.0-2.fc43 fedora 570.2 KiB libxcrypt-devel x86_64 0:4.5.2-1.fc44 fedora 31.0 KiB make x86_64 1:4.4.1-11.fc43 fedora 1.8 MiB man-db x86_64 0:2.13.1-2.fc43 fedora 2.9 MiB mpdecimal x86_64 0:4.0.1-2.fc43 fedora 217.2 KiB ncurses x86_64 0:6.5-7.20250614.fc43 fedora 609.8 KiB numactl-libs x86_64 0:2.0.19-3.fc43 fedora 56.9 KiB openssh x86_64 0:10.0p1-8.fc44 fedora 1.4 MiB openssh-clients x86_64 0:10.0p1-8.fc44 fedora 2.6 MiB pcre2-utf32 x86_64 0:10.47-1.fc44 fedora 611.1 KiB perl-AutoLoader noarch 0:5.74-520.fc43 fedora 20.6 KiB perl-B x86_64 0:1.89-520.fc43 fedora 501.3 KiB perl-Carp noarch 0:1.54-520.fc43 fedora 46.6 KiB perl-Class-Struct noarch 0:0.68-520.fc43 fedora 25.4 KiB perl-Data-Dumper x86_64 0:2.191-521.fc43 fedora 115.6 KiB perl-Digest noarch 0:1.20-520.fc43 fedora 35.3 KiB perl-Digest-MD5 x86_64 0:2.59-520.fc43 fedora 59.7 KiB perl-DynaLoader x86_64 
0:1.57-520.fc43 fedora 32.1 KiB perl-Encode x86_64 4:3.21-520.fc43 fedora 4.7 MiB perl-Errno x86_64 0:1.38-520.fc43 fedora 8.4 KiB perl-Error noarch 1:0.17030-2.fc43 fedora 76.7 KiB perl-Exporter noarch 0:5.79-520.fc43 fedora 54.3 KiB perl-Fcntl x86_64 0:1.20-520.fc43 fedora 48.8 KiB perl-File-Basename noarch 0:2.86-520.fc43 fedora 14.0 KiB perl-File-Copy noarch 0:2.41-520.fc43 fedora 19.7 KiB perl-File-Path noarch 0:2.18-520.fc43 fedora 63.5 KiB perl-File-Temp noarch 1:0.231.200-1.fc44 fedora 163.7 KiB perl-File-Which noarch 0:1.27-14.fc43 fedora 30.4 KiB perl-File-stat noarch 0:1.14-520.fc43 fedora 12.5 KiB perl-FileHandle noarch 0:2.05-520.fc43 fedora 9.4 KiB perl-Getopt-Long noarch 1:2.58-520.fc43 fedora 144.5 KiB perl-Getopt-Std noarch 0:1.14-520.fc43 fedora 11.2 KiB perl-Git noarch 0:2.51.1-1.fc44 fedora 64.4 KiB perl-HTTP-Tiny noarch 0:0.090-521.fc43 fedora 154.4 KiB perl-IO x86_64 0:1.55-520.fc43 fedora 147.4 KiB perl-IO-Socket-IP noarch 0:0.43-521.fc43 fedora 100.3 KiB perl-IO-Socket-SSL noarch 0:2.095-2.fc43 fedora 714.5 KiB perl-IPC-Open3 noarch 0:1.24-520.fc43 fedora 27.7 KiB perl-MIME-Base32 noarch 0:1.303-24.fc43 fedora 30.7 KiB perl-MIME-Base64 x86_64 0:3.16-520.fc43 fedora 42.0 KiB perl-Net-SSLeay x86_64 0:1.94-11.fc43 fedora 1.3 MiB perl-POSIX x86_64 0:2.23-520.fc43 fedora 231.4 KiB perl-PathTools x86_64 0:3.94-520.fc43 fedora 180.0 KiB perl-Pod-Escapes noarch 1:1.07-520.fc43 fedora 24.9 KiB perl-Pod-Perldoc noarch 0:3.28.01-521.fc43 fedora 163.7 KiB perl-Pod-Simple noarch 1:3.47-3.fc43 fedora 565.3 KiB perl-Pod-Usage noarch 4:2.05-520.fc43 fedora 86.3 KiB perl-Scalar-List-Utils x86_64 5:1.70-1.fc43 fedora 144.9 KiB perl-SelectSaver noarch 0:1.02-520.fc43 fedora 2.2 KiB perl-Socket x86_64 4:2.040-2.fc43 fedora 120.3 KiB perl-Storable x86_64 1:3.37-521.fc43 fedora 231.2 KiB perl-Symbol noarch 0:1.09-520.fc43 fedora 6.8 KiB perl-Term-ANSIColor noarch 0:5.01-521.fc43 fedora 97.5 KiB perl-Term-Cap noarch 0:1.18-520.fc43 fedora 29.3 KiB perl-TermReadKey 
x86_64 0:2.38-26.fc43 fedora 64.0 KiB perl-Text-ParseWords noarch 0:3.31-520.fc43 fedora 13.6 KiB perl-Text-Tabs+Wrap noarch 0:2024.001-520.fc43 fedora 22.6 KiB perl-Time-Local noarch 2:1.350-520.fc43 fedora 69.0 KiB perl-URI noarch 0:5.34-2.fc44 fedora 268.0 KiB perl-base noarch 0:2.27-520.fc43 fedora 12.6 KiB perl-constant noarch 0:1.33-521.fc43 fedora 26.2 KiB perl-if noarch 0:0.61.000-520.fc43 fedora 5.8 KiB perl-interpreter x86_64 4:5.42.0-520.fc43 fedora 118.6 KiB perl-lib x86_64 0:0.65-520.fc43 fedora 8.5 KiB perl-libnet noarch 0:3.15-521.fc43 fedora 289.4 KiB perl-libs x86_64 4:5.42.0-520.fc43 fedora 11.5 MiB perl-locale noarch 0:1.13-520.fc43 fedora 6.1 KiB perl-mro x86_64 0:1.29-520.fc43 fedora 41.6 KiB perl-overload noarch 0:1.40-520.fc43 fedora 71.6 KiB perl-overloading noarch 0:0.02-520.fc43 fedora 4.9 KiB perl-parent noarch 1:0.244-520.fc43 fedora 10.3 KiB perl-podlators noarch 1:6.0.2-520.fc43 fedora 317.5 KiB perl-vars noarch 0:1.05-520.fc43 fedora 3.9 KiB procps-ng x86_64 0:4.0.4-9.fc44 fedora 1.0 MiB python-pip-wheel noarch 0:25.2-4.fc44 fedora 1.2 MiB python3 x86_64 0:3.14.0-2.fc44 fedora 28.9 KiB python3-libs x86_64 0:3.14.0-2.fc44 fedora 43.0 MiB rhash x86_64 0:1.4.5-3.fc43 fedora 351.1 KiB rocm-clang x86_64 0:20-7.rocm7.1.0.fc44 copr_base 68.5 MiB rocm-clang-devel x86_64 0:20-7.rocm7.1.0.fc44 copr_base 26.1 MiB rocm-clang-libs x86_64 0:20-7.rocm7.1.0.fc44 copr_base 94.1 MiB rocm-clang-runtime-devel x86_64 0:20-7.rocm7.1.0.fc44 copr_base 8.4 MiB rocm-comgr x86_64 0:20-7.rocm7.1.0.fc44 copr_base 126.3 MiB rocm-device-libs x86_64 0:20-7.rocm7.1.0.fc44 copr_base 3.2 MiB rocm-hip x86_64 0:7.1.0-1.fc44 copr_base 27.0 MiB rocm-libc++ x86_64 0:20-7.rocm7.1.0.fc44 copr_base 1.3 MiB rocm-libc++-devel x86_64 0:20-7.rocm7.1.0.fc44 copr_base 15.0 MiB rocm-lld x86_64 0:20-7.rocm7.1.0.fc44 copr_base 5.9 MiB rocm-llvm x86_64 0:20-7.rocm7.1.0.fc44 copr_base 52.5 MiB rocm-llvm-devel x86_64 0:20-7.rocm7.1.0.fc44 copr_base 28.3 MiB rocm-llvm-filesystem x86_64 
0:20-7.rocm7.1.0.fc44 copr_base 0.0 B rocm-llvm-libs x86_64 0:20-7.rocm7.1.0.fc44 copr_base 91.6 MiB rocm-llvm-static x86_64 0:20-7.rocm7.1.0.fc44 copr_base 1.9 GiB rocm-runtime x86_64 0:7.1.0-1.fc44 copr_base 3.2 MiB tcl x86_64 1:9.0.2-1.fc44 fedora 4.3 MiB tzdata noarch 0:2025b-3.fc43 fedora 1.6 MiB vim-filesystem noarch 2:9.1.1914-1.fc44 fedora 40.0 B zlib-ng-compat-devel x86_64 0:2.2.5-2.fc44 fedora 107.0 KiB Transaction Summary: Installing: 138 packages Total size of inbound packages is 537 MiB. Need to download 537 MiB. After this operation, 3 GiB extra will be used (install 3 GiB, remove 0 B). [ 1/138] git-0:2.51.1-1.fc44.x86_64 100% | 318.9 KiB/s | 41.1 KiB | 00m00s [ 2/138] fdupes-1:2.4.0-2.fc43.x86_64 100% | 418.5 KiB/s | 59.0 KiB | 00m00s [ 3/138] ninja-build-0:1.13.1-4.fc44.x 100% | 1.0 MiB/s | 198.0 KiB | 00m00s [ 4/138] rocm-cmake-0:7.1.0-1.fc44.noa 100% | 12.4 MiB/s | 38.2 KiB | 00m00s [ 5/138] gcc-c++-0:15.2.1-4.fc44.x86_6 100% | 146.8 MiB/s | 15.3 MiB | 00m00s [ 6/138] rocm-comgr-devel-0:20-7.rocm7 100% | 664.1 KiB/s | 33.2 KiB | 00m00s [ 7/138] rocm-compilersupport-macros-0 100% | 988.8 KiB/s | 15.8 KiB | 00m00s [ 8/138] rocm-hip-devel-0:7.1.0-1.fc44 100% | 9.5 MiB/s | 263.3 KiB | 00m00s [ 9/138] rocm-rpm-macros-0:7.1.0-2.fc4 100% | 582.9 KiB/s | 16.3 KiB | 00m00s [ 10/138] rocm-runtime-devel-0:7.1.0-1. 100% | 2.7 MiB/s | 117.2 KiB | 00m00s [ 11/138] pcre2-utf32-0:10.47-1.fc44.x8 100% | 2.9 MiB/s | 232.9 KiB | 00m00s [ 12/138] cmake-0:3.31.6-4.fc43.x86_64 100% | 32.2 MiB/s | 12.2 MiB | 00m00s [ 13/138] perl-File-Basename-0:2.86-520 100% | 660.3 KiB/s | 17.2 KiB | 00m00s [ 14/138] git-core-0:2.51.1-1.fc44.x86_ 100% | 21.3 MiB/s | 5.0 MiB | 00m00s [ 15/138] perl-Getopt-Long-1:2.58-520.f 100% | 2.5 MiB/s | 63.6 KiB | 00m00s [ 16/138] git-core-doc-0:2.51.1-1.fc44. 
100% | 15.4 MiB/s | 3.0 MiB | 00m00s [ 17/138] perl-Git-0:2.51.1-1.fc44.noar 100% | 1.4 MiB/s | 38.2 KiB | 00m00s [ 18/138] perl-IPC-Open3-0:1.24-520.fc4 100% | 920.4 KiB/s | 23.9 KiB | 00m00s [ 19/138] perl-PathTools-0:3.94-520.fc4 100% | 3.4 MiB/s | 87.2 KiB | 00m00s [ 20/138] perl-TermReadKey-0:2.38-26.fc 100% | 1.3 MiB/s | 35.2 KiB | 00m00s [ 21/138] perl-interpreter-4:5.42.0-520 100% | 2.5 MiB/s | 72.4 KiB | 00m00s [ 22/138] perl-lib-0:0.65-520.fc43.x86_ 100% | 623.0 KiB/s | 15.0 KiB | 00m00s [ 23/138] vim-filesystem-2:9.1.1914-1.f 100% | 618.2 KiB/s | 15.5 KiB | 00m00s [ 24/138] cmake-filesystem-0:3.31.6-4.f 100% | 619.6 KiB/s | 15.5 KiB | 00m00s [ 25/138] cmake-data-0:3.31.6-4.fc43.no 100% | 63.3 MiB/s | 2.5 MiB | 00m00s [ 26/138] expat-0:2.7.2-1.fc44.x86_64 100% | 4.5 MiB/s | 119.0 KiB | 00m00s [ 27/138] jsoncpp-0:1.9.6-2.fc43.x86_64 100% | 3.9 MiB/s | 101.1 KiB | 00m00s [ 28/138] libuv-1:1.51.0-2.fc43.x86_64 100% | 9.6 MiB/s | 266.1 KiB | 00m00s [ 29/138] make-1:4.4.1-11.fc43.x86_64 100% | 19.0 MiB/s | 585.2 KiB | 00m00s [ 30/138] rhash-0:1.4.5-3.fc43.x86_64 100% | 7.2 MiB/s | 197.9 KiB | 00m00s [ 31/138] libmpc-0:1.3.1-8.fc43.x86_64 100% | 2.5 MiB/s | 70.4 KiB | 00m00s [ 32/138] perl-File-Copy-0:2.41-520.fc4 100% | 805.1 KiB/s | 20.1 KiB | 00m00s [ 33/138] perl-File-Which-0:1.27-14.fc4 100% | 855.8 KiB/s | 21.4 KiB | 00m00s [ 34/138] perl-Getopt-Std-0:1.14-520.fc 100% | 628.2 KiB/s | 15.7 KiB | 00m00s [ 35/138] perl-Scalar-List-Utils-5:1.70 100% | 2.8 MiB/s | 75.0 KiB | 00m00s [ 36/138] perl-URI-0:5.34-2.fc44.noarch 100% | 5.6 MiB/s | 149.4 KiB | 00m00s [ 37/138] environment-modules-0:5.6.0-1 100% | 25.9 MiB/s | 795.3 KiB | 00m00s [ 38/138] less-0:685-5.fc44.x86_64 100% | 6.5 MiB/s | 199.5 KiB | 00m00s [ 39/138] openssh-clients-0:10.0p1-8.fc 100% | 22.9 MiB/s | 749.5 KiB | 00m00s [ 40/138] perl-Carp-0:1.54-520.fc43.noa 100% | 1.1 MiB/s | 28.7 KiB | 00m00s [ 41/138] perl-Exporter-0:5.79-520.fc43 100% | 1.2 MiB/s | 30.9 KiB | 00m00s [ 42/138] 
perl-Pod-Usage-4:2.05-520.fc4 100% | 1.6 MiB/s | 40.5 KiB | 00m00s [ 43/138] perl-Text-ParseWords-0:3.31-5 100% | 628.7 KiB/s | 16.3 KiB | 00m00s [ 44/138] perl-base-0:2.27-520.fc43.noa 100% | 624.0 KiB/s | 16.2 KiB | 00m00s [ 45/138] perl-constant-0:1.33-521.fc43 100% | 910.8 KiB/s | 22.8 KiB | 00m00s [ 46/138] perl-overload-0:1.40-520.fc43 100% | 1.7 MiB/s | 45.6 KiB | 00m00s [ 47/138] perl-Error-1:0.17030-2.fc43.n 100% | 1.6 MiB/s | 40.2 KiB | 00m00s [ 48/138] perl-Fcntl-0:1.20-520.fc43.x8 100% | 1.2 MiB/s | 29.8 KiB | 00m00s [ 49/138] perl-IO-0:1.55-520.fc43.x86_6 100% | 3.1 MiB/s | 82.2 KiB | 00m00s [ 50/138] perl-POSIX-0:2.23-520.fc43.x8 100% | 3.7 MiB/s | 97.8 KiB | 00m00s [ 51/138] perl-Symbol-0:1.09-520.fc43.n 100% | 568.1 KiB/s | 14.2 KiB | 00m00s [ 52/138] perl-Errno-0:1.38-520.fc43.x8 100% | 597.7 KiB/s | 14.9 KiB | 00m00s [ 53/138] perl-DynaLoader-0:1.57-520.fc 100% | 839.2 KiB/s | 26.0 KiB | 00m00s [ 54/138] perl-vars-0:1.05-520.fc43.noa 100% | 519.5 KiB/s | 13.0 KiB | 00m00s [ 55/138] perl-libs-4:5.42.0-520.fc43.x 100% | 44.9 MiB/s | 2.6 MiB | 00m00s [ 56/138] emacs-filesystem-1:30.0-5.fc4 100% | 312.0 KiB/s | 7.5 KiB | 00m00s [ 57/138] perl-Data-Dumper-0:2.191-521. 100% | 2.1 MiB/s | 56.3 KiB | 00m00s [ 58/138] perl-MIME-Base32-0:1.303-24.f 100% | 814.1 KiB/s | 20.4 KiB | 00m00s [ 59/138] perl-MIME-Base64-0:3.16-520.f 100% | 1.2 MiB/s | 29.7 KiB | 00m00s [ 60/138] perl-libnet-0:3.15-521.fc43.n 100% | 4.8 MiB/s | 128.3 KiB | 00m00s [ 61/138] perl-parent-1:0.244-520.fc43. 100% | 569.4 KiB/s | 14.8 KiB | 00m00s [ 62/138] man-db-0:2.13.1-2.fc43.x86_64 100% | 35.8 MiB/s | 1.4 MiB | 00m00s [ 63/138] libedit-0:3.1-57.20251016cvs. 
100% | 3.9 MiB/s | 105.0 KiB | 00m00s [ 64/138] libfido2-0:1.16.0-4.fc44.x86_ 100% | 3.7 MiB/s | 98.5 KiB | 00m00s [ 65/138] openssh-0:10.0p1-8.fc44.x86_6 100% | 11.8 MiB/s | 338.7 KiB | 00m00s [ 66/138] perl-Pod-Perldoc-0:3.28.01-52 100% | 3.2 MiB/s | 84.3 KiB | 00m00s [ 67/138] perl-podlators-1:6.0.2-520.fc 100% | 4.8 MiB/s | 128.4 KiB | 00m00s [ 68/138] perl-mro-0:1.29-520.fc43.x86_ 100% | 1.2 MiB/s | 29.9 KiB | 00m00s [ 69/138] perl-overloading-0:0.02-520.f 100% | 516.4 KiB/s | 12.9 KiB | 00m00s [ 70/138] perl-File-stat-0:1.14-520.fc4 100% | 682.5 KiB/s | 17.1 KiB | 00m00s [ 71/138] perl-SelectSaver-0:1.02-520.f 100% | 468.9 KiB/s | 11.7 KiB | 00m00s [ 72/138] perl-Socket-4:2.040-2.fc43.x8 100% | 2.1 MiB/s | 54.9 KiB | 00m00s [ 73/138] perl-locale-0:1.13-520.fc43.n 100% | 540.2 KiB/s | 13.5 KiB | 00m00s [ 74/138] perl-B-0:1.89-520.fc43.x86_64 100% | 6.7 MiB/s | 177.7 KiB | 00m00s [ 75/138] perl-Digest-MD5-0:2.59-520.fc 100% | 1.3 MiB/s | 35.8 KiB | 00m00s [ 76/138] perl-FileHandle-0:2.05-520.fc 100% | 620.0 KiB/s | 15.5 KiB | 00m00s [ 77/138] perl-IO-Socket-IP-0:0.43-521. 100% | 1.6 MiB/s | 42.1 KiB | 00m00s [ 78/138] perl-Time-Local-2:1.350-520.f 100% | 1.3 MiB/s | 34.4 KiB | 00m00s [ 79/138] groff-base-0:1.23.0-11.fc44.x 100% | 34.4 MiB/s | 1.1 MiB | 00m00s [ 80/138] libpipeline-0:1.5.8-3.fc43.x8 100% | 2.3 MiB/s | 59.9 KiB | 00m00s [ 81/138] libcbor-0:0.13.0-1.fc44.x86_6 100% | 1.3 MiB/s | 34.5 KiB | 00m00s [ 82/138] perl-File-Temp-1:0.231.200-1. 100% | 2.2 MiB/s | 59.5 KiB | 00m00s [ 83/138] perl-HTTP-Tiny-0:0.090-521.fc 100% | 2.2 MiB/s | 56.3 KiB | 00m00s [ 84/138] perl-Pod-Simple-1:3.47-3.fc43 100% | 8.3 MiB/s | 219.9 KiB | 00m00s [ 85/138] perl-Term-ANSIColor-0:5.01-52 100% | 1.8 MiB/s | 47.6 KiB | 00m00s [ 86/138] perl-Term-Cap-0:1.18-520.fc43 100% | 877.3 KiB/s | 21.9 KiB | 00m00s [ 87/138] perl-Class-Struct-0:0.68-520. 
100% | 883.0 KiB/s | 22.1 KiB | 00m00s [ 88/138] perl-if-0:0.61.000-520.fc43.n 100% | 538.6 KiB/s | 14.0 KiB | 00m00s [ 89/138] perl-Digest-0:1.20-520.fc43.n 100% | 991.5 KiB/s | 24.8 KiB | 00m00s [ 90/138] perl-File-Path-0:2.18-520.fc4 100% | 1.4 MiB/s | 35.1 KiB | 00m00s [ 91/138] perl-IO-Socket-SSL-0:2.095-2. 100% | 8.4 MiB/s | 231.5 KiB | 00m00s [ 92/138] perl-Net-SSLeay-0:1.94-11.fc4 100% | 13.1 MiB/s | 374.8 KiB | 00m00s [ 93/138] perl-Pod-Escapes-1:1.07-520.f 100% | 791.2 KiB/s | 19.8 KiB | 00m00s [ 94/138] perl-Text-Tabs+Wrap-0:2024.00 100% | 832.1 KiB/s | 21.6 KiB | 00m00s [ 95/138] perl-AutoLoader-0:5.74-520.fc 100% | 817.1 KiB/s | 21.2 KiB | 00m00s [ 96/138] ncurses-0:6.5-7.20250614.fc43 100% | 14.4 MiB/s | 426.2 KiB | 00m00s [ 97/138] perl-Encode-4:3.21-520.fc43.x 100% | 31.9 MiB/s | 1.1 MiB | 00m00s [ 98/138] perl-Storable-1:3.37-521.fc43 100% | 3.7 MiB/s | 98.5 KiB | 00m00s [ 99/138] python3-0:3.14.0-2.fc44.x86_6 100% | 1.1 MiB/s | 27.7 KiB | 00m00s [100/138] mpdecimal-0:4.0.1-2.fc43.x86_ 100% | 3.8 MiB/s | 97.1 KiB | 00m00s [101/138] tzdata-0:2025b-3.fc43.noarch 100% | 22.5 MiB/s | 713.9 KiB | 00m00s [102/138] python-pip-wheel-0:25.2-4.fc4 100% | 18.9 MiB/s | 1.1 MiB | 00m00s [103/138] python3-libs-0:3.14.0-2.fc44. 100% | 109.1 MiB/s | 9.8 MiB | 00m00s [104/138] procps-ng-0:4.0.4-9.fc44.x86_ 100% | 11.5 MiB/s | 364.5 KiB | 00m00s [105/138] rocm-runtime-0:7.1.0-1.fc44.x 100% | 89.5 MiB/s | 641.5 KiB | 00m00s [106/138] tcl-1:9.0.2-1.fc44.x86_64 100% | 33.3 MiB/s | 1.2 MiB | 00m00s [107/138] libtommath-0:1.3.1~rc1-6.fc43 100% | 2.4 MiB/s | 64.3 KiB | 00m00s [108/138] libdrm-0:2.4.128-3.fc44.x86_6 100% | 5.9 MiB/s | 162.0 KiB | 00m00s [109/138] numactl-libs-0:2.0.19-3.fc43. 
100% | 1.2 MiB/s | 31.1 KiB | 00m00s [110/138] libpciaccess-0:0.16-16.fc43.x 100% | 1.0 MiB/s | 26.2 KiB | 00m00s [111/138] hipcc-0:20-7.rocm7.1.0.fc44.x 100% | 7.2 MiB/s | 133.2 KiB | 00m00s [112/138] hwdata-0:0.401-1.fc44.noarch 100% | 40.5 MiB/s | 1.7 MiB | 00m00s [113/138] rocm-hip-0:7.1.0-1.fc44.x86_6 100% | 176.5 MiB/s | 10.2 MiB | 00m00s [114/138] rocm-device-libs-0:20-7.rocm7 100% | 7.1 MiB/s | 496.6 KiB | 00m00s [115/138] libstdc++-devel-0:15.2.1-4.fc 100% | 67.9 MiB/s | 5.2 MiB | 00m00s [116/138] cpp-0:15.2.1-4.fc44.x86_64 100% | 177.0 MiB/s | 12.9 MiB | 00m00s [117/138] gcc-0:15.2.1-4.fc44.x86_64 100% | 153.6 MiB/s | 39.6 MiB | 00m00s [118/138] glibc-devel-0:2.42.9000-11.fc 100% | 10.3 MiB/s | 591.8 KiB | 00m00s [119/138] libxcrypt-devel-0:4.5.2-1.fc4 100% | 1.1 MiB/s | 30.0 KiB | 00m00s [120/138] kernel-headers-0:6.18.0-0.rc6 100% | 42.8 MiB/s | 1.7 MiB | 00m00s [121/138] rocm-clang-devel-0:20-7.rocm7 100% | 120.8 MiB/s | 2.5 MiB | 00m00s [122/138] rocm-lld-0:20-7.rocm7.1.0.fc4 100% | 15.8 MiB/s | 1.6 MiB | 00m00s [123/138] rocm-clang-0:20-7.rocm7.1.0.f 100% | 37.3 MiB/s | 15.9 MiB | 00m00s [124/138] rocm-comgr-0:20-7.rocm7.1.0.f 100% | 32.9 MiB/s | 31.3 MiB | 00m01s [125/138] rocm-clang-runtime-devel-0:20 100% | 5.8 MiB/s | 637.9 KiB | 00m00s [126/138] rocm-libc++-devel-0:20-7.rocm 100% | 13.1 MiB/s | 1.2 MiB | 00m00s [127/138] rocm-llvm-static-0:20-7.rocm7 100% | 234.1 MiB/s | 282.1 MiB | 00m01s [128/138] rocm-llvm-devel-0:20-7.rocm7. 100% | 21.1 MiB/s | 4.0 MiB | 00m00s [129/138] rocm-clang-libs-0:20-7.rocm7. 100% | 25.4 MiB/s | 23.0 MiB | 00m01s [130/138] rocm-llvm-filesystem-0:20-7.r 100% | 734.4 KiB/s | 25.7 KiB | 00m00s [131/138] rocm-libc++-0:20-7.rocm7.1.0. 100% | 5.4 MiB/s | 373.9 KiB | 00m00s [132/138] zlib-ng-compat-devel-0:2.2.5- 100% | 1.4 MiB/s | 38.3 KiB | 00m00s [133/138] rocm-llvm-libs-0:20-7.rocm7.1 100% | 29.8 MiB/s | 21.2 MiB | 00m01s [134/138] annobin-plugin-gcc-0:13.03-1. 
100% | 16.7 MiB/s | 682.8 KiB | 00m00s [135/138] cmake-rpm-macros-0:3.31.6-4.f 100% | 592.3 KiB/s | 14.8 KiB | 00m00s [136/138] annobin-docs-0:13.03-1.fc44.n 100% | 3.4 MiB/s | 89.4 KiB | 00m00s [137/138] gcc-plugin-annobin-0:15.2.1-4 100% | 830.3 KiB/s | 59.0 KiB | 00m00s [138/138] rocm-llvm-0:20-7.rocm7.1.0.fc 100% | 30.7 MiB/s | 13.5 MiB | 00m00s -------------------------------------------------------------------------------- [138/138] Total 100% | 144.5 MiB/s | 537.4 MiB | 00m04s Running transaction [ 1/140] Verify package files 100% | 69.0 B/s | 138.0 B | 00m02s [ 2/140] Prepare transaction 100% | 1.4 KiB/s | 138.0 B | 00m00s [ 3/140] Installing cmake-filesystem-0 100% | 7.4 MiB/s | 7.6 KiB | 00m00s [ 4/140] Installing less-0:685-5.fc44. 100% | 33.9 MiB/s | 416.8 KiB | 00m00s [ 5/140] Installing libmpc-0:1.3.1-8.f 100% | 158.3 MiB/s | 162.1 KiB | 00m00s [ 6/140] Installing expat-0:2.7.2-1.fc 100% | 26.7 MiB/s | 300.7 KiB | 00m00s [ 7/140] Installing vim-filesystem-2:9 100% | 4.6 MiB/s | 4.7 KiB | 00m00s [ 8/140] Installing rocm-llvm-filesyst 100% | 9.3 MiB/s | 19.1 KiB | 00m00s [ 9/140] Installing rocm-libc++-0:20-7 100% | 51.5 MiB/s | 1.3 MiB | 00m00s [ 10/140] Installing rocm-llvm-libs-0:2 100% | 84.9 MiB/s | 91.6 MiB | 00m01s [ 11/140] Installing rocm-clang-libs-0: 100% | 84.4 MiB/s | 94.1 MiB | 00m01s [ 12/140] Installing rocm-comgr-0:20-7. 100% | 80.6 MiB/s | 126.3 MiB | 00m02s [ 13/140] Installing numactl-libs-0:2.0 100% | 9.4 MiB/s | 57.8 KiB | 00m00s [ 14/140] Installing groff-base-0:1.23. 
100% | 132.6 MiB/s | 3.8 MiB | 00m00s [ 15/140] Installing emacs-filesystem-1 100% | 0.0 B/s | 544.0 B | 00m00s [ 16/140] Installing make-1:4.4.1-11.fc 100% | 112.5 MiB/s | 1.8 MiB | 00m00s [ 17/140] Installing rocm-lld-0:20-7.ro 100% | 73.6 MiB/s | 5.9 MiB | 00m00s [ 18/140] Installing rocm-libc++-devel- 100% | 128.0 MiB/s | 15.4 MiB | 00m00s [ 19/140] Installing cpp-0:15.2.1-4.fc4 100% | 391.3 MiB/s | 38.0 MiB | 00m00s [ 20/140] Installing zlib-ng-compat-dev 100% | 106.0 MiB/s | 108.5 KiB | 00m00s [ 21/140] Installing annobin-docs-0:13. 100% | 98.0 MiB/s | 100.3 KiB | 00m00s [ 22/140] Installing rocm-clang-runtime 100% | 148.8 MiB/s | 8.5 MiB | 00m00s [ 23/140] Installing kernel-headers-0:6 100% | 231.3 MiB/s | 6.9 MiB | 00m00s [ 24/140] Installing glibc-devel-0:2.42 100% | 215.1 MiB/s | 2.4 MiB | 00m00s [ 25/140] Installing libxcrypt-devel-0: 100% | 32.5 MiB/s | 33.3 KiB | 00m00s [ 26/140] Installing gcc-0:15.2.1-4.fc4 100% | 449.8 MiB/s | 112.0 MiB | 00m00s [ 27/140] Installing libstdc++-devel-0: 100% | 518.2 MiB/s | 37.3 MiB | 00m00s [ 28/140] Installing hwdata-0:0.401-1.f 100% | 600.9 MiB/s | 9.6 MiB | 00m00s [ 29/140] Installing libpciaccess-0:0.1 100% | 44.8 MiB/s | 45.9 KiB | 00m00s [ 30/140] Installing libdrm-0:2.4.128-3 100% | 197.1 MiB/s | 403.7 KiB | 00m00s [ 31/140] Installing rocm-runtime-0:7.1 100% | 536.7 MiB/s | 3.2 MiB | 00m00s [ 32/140] Installing rocm-runtime-devel 100% | 335.7 MiB/s | 687.6 KiB | 00m00s [ 33/140] Installing libtommath-0:1.3.1 100% | 124.5 MiB/s | 127.5 KiB | 00m00s [ 34/140] Installing tcl-1:9.0.2-1.fc44 100% | 188.6 MiB/s | 4.3 MiB | 00m00s [ 35/140] Installing procps-ng-0:4.0.4- 100% | 56.1 MiB/s | 1.0 MiB | 00m00s [ 36/140] Installing tzdata-0:2025b-3.f 100% | 70.1 MiB/s | 1.9 MiB | 00m00s [ 37/140] Installing python-pip-wheel-0 100% | 589.9 MiB/s | 1.2 MiB | 00m00s [ 38/140] Installing mpdecimal-0:4.0.1- 100% | 42.7 MiB/s | 218.8 KiB | 00m00s [ 39/140] Installing python3-libs-0:3.1 100% | 376.9 MiB/s | 43.3 MiB | 00m00s [ 
40/140] Installing python3-0:3.14.0-2 100% | 2.5 MiB/s | 30.6 KiB | 00m00s [ 41/140] Installing cmake-rpm-macros-0 100% | 8.1 MiB/s | 8.3 KiB | 00m00s [ 42/140] Installing rocm-llvm-0:20-7.r 100% | 78.9 MiB/s | 52.5 MiB | 00m01s [ 43/140] Installing rocm-llvm-devel-0: 100% | 107.6 MiB/s | 28.7 MiB | 00m00s [ 44/140] Installing rocm-llvm-static-0 100% | 105.6 MiB/s | 1.9 GiB | 00m19s [ 45/140] Installing ncurses-0:6.5-7.20 100% | 43.0 MiB/s | 616.4 KiB | 00m00s [ 46/140] Installing perl-Digest-0:1.20 100% | 0.0 B/s | 37.1 KiB | 00m00s [ 47/140] Installing perl-Digest-MD5-0: 100% | 60.1 MiB/s | 61.6 KiB | 00m00s [ 48/140] Installing perl-B-0:1.89-520. 100% | 246.4 MiB/s | 504.7 KiB | 00m00s [ 49/140] Installing perl-FileHandle-0: 100% | 0.0 B/s | 9.8 KiB | 00m00s [ 50/140] Installing perl-libnet-0:3.15 100% | 287.8 MiB/s | 294.7 KiB | 00m00s [ 51/140] Installing perl-Data-Dumper-0 100% | 114.8 MiB/s | 117.5 KiB | 00m00s [ 52/140] Installing perl-MIME-Base32-0 100% | 0.0 B/s | 32.2 KiB | 00m00s [ 53/140] Installing perl-AutoLoader-0: 100% | 0.0 B/s | 21.0 KiB | 00m00s [ 54/140] Installing perl-URI-0:5.34-2. 
100% | 137.6 MiB/s | 281.8 KiB | 00m00s [ 55/140] Installing perl-IO-Socket-IP- 100% | 99.8 MiB/s | 102.2 KiB | 00m00s [ 56/140] Installing perl-Net-SSLeay-0: 100% | 271.7 MiB/s | 1.4 MiB | 00m00s [ 57/140] Installing perl-IO-Socket-SSL 100% | 350.9 MiB/s | 718.6 KiB | 00m00s [ 58/140] Installing perl-Text-Tabs+Wra 100% | 0.0 B/s | 23.9 KiB | 00m00s [ 59/140] Installing perl-Pod-Escapes-1 100% | 0.0 B/s | 25.9 KiB | 00m00s [ 60/140] Installing perl-File-Path-0:2 100% | 0.0 B/s | 64.5 KiB | 00m00s [ 61/140] Installing perl-if-0:0.61.000 100% | 0.0 B/s | 6.2 KiB | 00m00s [ 62/140] Installing perl-Time-Local-2: 100% | 68.9 MiB/s | 70.6 KiB | 00m00s [ 63/140] Installing perl-locale-0:1.13 100% | 0.0 B/s | 6.5 KiB | 00m00s [ 64/140] Installing perl-Pod-Simple-1: 100% | 280.7 MiB/s | 574.9 KiB | 00m00s [ 65/140] Installing perl-HTTP-Tiny-0:0 100% | 152.8 MiB/s | 156.4 KiB | 00m00s [ 66/140] Installing perl-File-Temp-1:0 100% | 161.6 MiB/s | 165.5 KiB | 00m00s [ 67/140] Installing perl-Class-Struct- 100% | 0.0 B/s | 25.9 KiB | 00m00s [ 68/140] Installing perl-IPC-Open3-0:1 100% | 0.0 B/s | 28.5 KiB | 00m00s [ 69/140] Installing perl-Term-Cap-0:1. 
100% | 0.0 B/s | 30.6 KiB | 00m00s [ 70/140] Installing perl-Term-ANSIColo 100% | 96.9 MiB/s | 99.2 KiB | 00m00s [ 71/140] Installing perl-POSIX-0:2.23- 100% | 227.2 MiB/s | 232.6 KiB | 00m00s [ 72/140] Installing perl-podlators-1:6 100% | 26.2 MiB/s | 321.4 KiB | 00m00s [ 73/140] Installing perl-Pod-Perldoc-0 100% | 13.8 MiB/s | 169.2 KiB | 00m00s [ 74/140] Installing perl-File-stat-0:1 100% | 0.0 B/s | 13.1 KiB | 00m00s [ 75/140] Installing perl-Socket-4:2.04 100% | 119.4 MiB/s | 122.3 KiB | 00m00s [ 76/140] Installing perl-SelectSaver-0 100% | 0.0 B/s | 2.6 KiB | 00m00s [ 77/140] Installing perl-Symbol-0:1.09 100% | 0.0 B/s | 7.3 KiB | 00m00s [ 78/140] Installing perl-Pod-Usage-4:2 100% | 7.8 MiB/s | 87.9 KiB | 00m00s [ 79/140] Installing perl-IO-0:1.55-520 100% | 148.1 MiB/s | 151.7 KiB | 00m00s [ 80/140] Installing perl-overloading-0 100% | 0.0 B/s | 5.6 KiB | 00m00s [ 81/140] Installing perl-mro-0:1.29-52 100% | 0.0 B/s | 42.7 KiB | 00m00s [ 82/140] Installing perl-Fcntl-0:1.20- 100% | 0.0 B/s | 49.9 KiB | 00m00s [ 83/140] Installing perl-base-0:2.27-5 100% | 0.0 B/s | 13.0 KiB | 00m00s [ 84/140] Installing perl-Text-ParseWor 100% | 0.0 B/s | 14.6 KiB | 00m00s [ 85/140] Installing perl-File-Basename 100% | 0.0 B/s | 14.6 KiB | 00m00s [ 86/140] Installing perl-Getopt-Long-1 100% | 143.8 MiB/s | 147.2 KiB | 00m00s [ 87/140] Installing perl-Storable-1:3. 100% | 227.4 MiB/s | 232.8 KiB | 00m00s [ 88/140] Installing perl-overload-0:1. 100% | 0.0 B/s | 72.0 KiB | 00m00s [ 89/140] Installing perl-parent-1:0.24 100% | 0.0 B/s | 11.0 KiB | 00m00s [ 90/140] Installing perl-MIME-Base64-0 100% | 43.2 MiB/s | 44.3 KiB | 00m00s [ 91/140] Installing perl-vars-0:1.05-5 100% | 0.0 B/s | 4.3 KiB | 00m00s [ 92/140] Installing perl-Errno-0:1.38- 100% | 0.0 B/s | 8.8 KiB | 00m00s [ 93/140] Installing perl-constant-0:1. 
100% | 0.0 B/s | 27.4 KiB | 00m00s [ 94/140] Installing perl-Scalar-List-U 100% | 145.2 MiB/s | 148.7 KiB | 00m00s [ 95/140] Installing perl-Getopt-Std-0: 100% | 0.0 B/s | 11.8 KiB | 00m00s [ 96/140] Installing perl-Encode-4:3.21 100% | 223.5 MiB/s | 4.7 MiB | 00m00s [ 97/140] Installing perl-DynaLoader-0: 100% | 0.0 B/s | 32.5 KiB | 00m00s [ 98/140] Installing perl-PathTools-0:3 100% | 180.2 MiB/s | 184.6 KiB | 00m00s [ 99/140] Installing perl-Exporter-0:5. 100% | 0.0 B/s | 55.6 KiB | 00m00s [100/140] Installing perl-Carp-0:1.54-5 100% | 23.3 MiB/s | 47.7 KiB | 00m00s [101/140] Installing perl-libs-4:5.42.0 100% | 332.8 MiB/s | 11.6 MiB | 00m00s [102/140] Installing perl-interpreter-4 100% | 10.7 MiB/s | 120.3 KiB | 00m00s [103/140] Installing perl-TermReadKey-0 100% | 64.6 MiB/s | 66.2 KiB | 00m00s [104/140] Installing perl-lib-0:0.65-52 100% | 0.0 B/s | 8.9 KiB | 00m00s [105/140] Installing perl-File-Copy-0:2 100% | 0.0 B/s | 20.2 KiB | 00m00s [106/140] Installing perl-File-Which-0: 100% | 30.7 MiB/s | 31.4 KiB | 00m00s [107/140] Installing perl-Error-1:0.170 100% | 78.1 MiB/s | 80.0 KiB | 00m00s [108/140] Installing libcbor-0:0.13.0-1 100% | 78.9 MiB/s | 80.8 KiB | 00m00s [109/140] Installing libfido2-0:1.16.0- 100% | 234.4 MiB/s | 240.1 KiB | 00m00s [110/140] Installing libpipeline-0:1.5. 100% | 15.9 MiB/s | 146.6 KiB | 00m00s [111/140] Installing man-db-0:2.13.1-2. 
100% | 97.1 MiB/s | 2.9 MiB | 00m00s [112/140] Installing environment-module 100% | 78.6 MiB/s | 1.9 MiB | 00m00s [113/140] Installing openssh-0:10.0p1-8 100% | 99.4 MiB/s | 1.4 MiB | 00m00s [114/140] Installing libedit-0:3.1-57.2 100% | 236.2 MiB/s | 241.8 KiB | 00m00s [115/140] Installing openssh-clients-0: 100% | 125.0 MiB/s | 2.6 MiB | 00m00s [116/140] Installing git-core-0:2.51.1- 100% | 381.9 MiB/s | 23.7 MiB | 00m00s [117/140] Installing git-core-doc-0:2.5 100% | 416.3 MiB/s | 17.9 MiB | 00m00s [118/140] Installing git-0:2.51.1-1.fc4 100% | 0.0 B/s | 57.7 KiB | 00m00s [119/140] Installing perl-Git-0:2.51.1- 100% | 63.8 MiB/s | 65.4 KiB | 00m00s [120/140] Installing rocm-clang-0:20-7. 100% | 85.8 MiB/s | 68.5 MiB | 00m01s [121/140] Installing rocm-clang-devel-0 100% | 138.2 MiB/s | 26.3 MiB | 00m00s [122/140] Installing rocm-device-libs-0 100% | 105.2 MiB/s | 3.3 MiB | 00m00s [123/140] Installing rocm-comgr-devel-0 100% | 99.5 MiB/s | 101.9 KiB | 00m00s [124/140] Installing hipcc-0:20-7.rocm7 100% | 34.5 MiB/s | 635.9 KiB | 00m00s [125/140] Installing rocm-hip-0:7.1.0-1 100% | 409.2 MiB/s | 27.0 MiB | 00m00s [126/140] Installing rhash-0:1.4.5-3.fc 100% | 24.9 MiB/s | 356.4 KiB | 00m00s [127/140] Installing libuv-1:1.51.0-2.f 100% | 279.8 MiB/s | 573.0 KiB | 00m00s [128/140] Installing jsoncpp-0:1.9.6-2. 100% | 253.1 MiB/s | 259.2 KiB | 00m00s [129/140] Installing cmake-0:3.31.6-4.f 100% | 352.1 MiB/s | 34.5 MiB | 00m00s [130/140] Installing cmake-data-0:3.31. 
100% | 133.3 MiB/s | 9.1 MiB | 00m00s [131/140] Installing pcre2-utf32-0:10.4 100% | 298.8 MiB/s | 611.9 KiB | 00m00s [132/140] Installing fdupes-1:2.4.0-2.f 100% | 6.9 MiB/s | 120.0 KiB | 00m00s [133/140] Installing rocm-cmake-0:7.1.0 100% | 131.5 MiB/s | 134.6 KiB | 00m00s [134/140] Installing rocm-hip-devel-0:7 100% | 152.7 MiB/s | 2.4 MiB | 00m00s [135/140] Installing rocm-rpm-macros-0: 100% | 0.0 B/s | 19.5 KiB | 00m00s [136/140] Installing ninja-build-0:1.13 100% | 39.4 MiB/s | 483.8 KiB | 00m00s [137/140] Installing gcc-c++-0:15.2.1-4 100% | 394.0 MiB/s | 41.4 MiB | 00m00s [138/140] Installing annobin-plugin-gcc 100% | 56.8 MiB/s | 697.4 KiB | 00m00s [139/140] Installing gcc-plugin-annobin 100% | 5.2 MiB/s | 58.8 KiB | 00m00s [140/140] Installing rocm-compilersuppo 100% | 3.9 KiB/s | 440.0 B | 00m00s Warning: skipped OpenPGP checks for 27 packages from repository: copr_base Complete! Finish: build setup for composable_kernel-7.1.0-2.fc44.src.rpm Start: rpmbuild composable_kernel-7.1.0-2.fc44.src.rpm Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1763424000 Executing(%mkbuilddir): /bin/sh -e /var/tmp/rpm-tmp.zWdx3I Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.QSDNpu + umask 022 + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + rm -rf composable_kernel-rocm-7.1.0 + /usr/lib/rpm/rpmuncompress -x /builddir/build/SOURCES/composable_kernel-7.1.0.tar.gz + STATUS=0 + '[' 0 -ne 0 ']' + cd composable_kernel-rocm-7.1.0 + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w . 
+ /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/0001-composable_kernel-per-dir-build.patch + /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f + sed -i -e 's@add_compile_options(-Werror)@#add_compile_options(-Werror)@' CMakeLists.txt + sed -i -e /-Werror/d cmake/EnableCompilerWarnings.cmake + sed -i -e 's@add_compile_options(-Weverything)@#add_compile_options(-Weverything)@' CMakeLists.txt + sed -i -e /-Wextra/d cmake/EnableCompilerWarnings.cmake + sed -i -e /-Wunused/d cmake/EnableCompilerWarnings.cmake + sed -i -e /-Weverything/d cmake/EnableCompilerWarnings.cmake + sed -i -e 's@-Wno-unknown-warning-option@-Wno-unknown-warning-option -Wno-unused-parameter@' cmake/EnableCompilerWarnings.cmake + sed -i -e 's@CK_TIME_KERNEL 1@CK_TIME_KERNEL 0@' include/ck/ck.hpp + sed -i -e 's@add_subdirectory(example)@#add_subdirectory(example)@' CMakeLists.txt + sed -i -e 's@add_subdirectory(profiler)@#add_subdirectory(profiler)@' CMakeLists.txt + sed -i -e s@STATIC@SHARED@ library/src/utility/CMakeLists.txt library/src/tensor_operation_instance/gpu/CMakeLists.txt + sed -i -e 's@POSITION_INDEPENDENT_CODE ON@POSITION_INDEPENDENT_CODE ON SOVERSION \"7.1.0\"@' library/src/utility/CMakeLists.txt library/src/tensor_operation_instance/gpu/CMakeLists.txt + RPM_EC=0 ++ jobs -p + exit 0 Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.gF7Wp3 + umask 022 + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=thin -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + cd composable_kernel-rocm-7.1.0 ++ cat /proc/cpuinfo ++ grep -m 1 'cpu cores' ++ awk '{ print $4 }' + COMPILE_JOBS=2 + '[' 2x = x ']' + '[' 2 = 1 ']' + BUILD_MEM=6 + MEM_KB=0 ++ cat /proc/meminfo ++ grep MemTotal ++ awk '{ print $2 }' + MEM_KB=7953336 ++ eval 'expr 7953336 / 1024' 
+++ expr 7953336 / 1024 + MEM_MB=7766 ++ eval 'expr 7766 / 1024' +++ expr 7766 / 1024 + MEM_GB=7 ++ eval 'expr 1 + 7 / 6' +++ expr 1 + 7 / 6 + COMPILE_JOBS_MEM=2 + '[' 2 -lt 2 ']' + LINK_MEM=12 ++ eval 'expr 1 + 7 / 12' +++ expr 1 + 7 / 12 + LINK_JOBS=1 + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=thin -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer 
-I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + /usr/bin/cmake -S . -B redhat-linux-build -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON -G Ninja -DBUILD_TESTING=OFF -DCK_BUILD_DEVICE_CONV=ON -DCK_BUILD_DEVICE_CONTRACTION=ON '-DCK_BUILD_DEVICE_GEMM=%{build_ck_gem}' -DCK_BUILD_DEVICE_MHA=ON -DCK_BUILD_DEVICE_OTHER=ON -DCK_BUILD_DEVICE_REDUCTION=ON -DCK_PARALLEL_COMPILE_JOBS=2 -DCK_PARALLEL_LINK_JOBS=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_COMPILER=/usr/lib64/rocm/llvm/bin/clang++ -DCMAKE_CXX_FLAGS=-fuse-ld=bfd -DCMAKE_EXPORT_COMPILE_COMMANDS=OFF '-DCMAKE_HIP_ARCHITECTURES=gfx11-generic;gfx12-generic' -DCMAKE_HIP_COMPILER=/usr/lib64/rocm/llvm/bin/clang++ -DCMAKE_INSTALL_LIBDIR=/usr/lib64 -DENABLE_CLANG_CPP_CHECKS=OFF '-DGPU_ARCHS=gfx11-generic;gfx12-generic' -DHIP_PLATFORM=amd -DROCM_SYMLINK_LIBS=OFF -- The CXX compiler identification is Clang 20.0.0 -- The HIP compiler identification is Clang 20.0.0 -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: 
/usr/lib64/rocm/llvm/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting HIP compiler ABI info
-- Detecting HIP compiler ABI info - done
-- Check for working HIP compiler: /usr/lib64/rocm/llvm/bin/clang++ - skipped
-- Detecting HIP compile features
-- Detecting HIP compile features - done
-- Found Python3: /usr/bin/python3.14 (found suitable version "3.14.0", minimum required is "3.8") found components: Interpreter
-- Found Git: /usr/bin/git (found version "2.51.1")
fatal: not a git repository (or any of the parent directories): .git
CMake Deprecation Warning at /usr/share/rocm/cmake/ROCMConfig.cmake:12 (message):
  Use of find_package(ROCM) is deprecated as of ROCm 6.4. Please use find_package(ROCmCMakeBuildTools)
Call Stack (most recent call first):
  CMakeLists.txt:148 (find_package)
-- GPU_TARGETS=
-- GPU_ARCHS= gfx11-generic;gfx12-generic
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
-- hip_version_flat=700125436
-- checking which targets are supported
-- Performing Test COMPILER_HAS_TARGET_ID_gfx11_generic
-- Performing Test COMPILER_HAS_TARGET_ID_gfx11_generic - Success
-- Performing Test COMPILER_HAS_TARGET_ID_gfx12_generic
-- Performing Test COMPILER_HAS_TARGET_ID_gfx12_generic - Success
-- Building CK for the following targets: gfx11-generic;gfx12-generic
-- Enabling WMMA instances
-- Enabling WMMA FP8 gemms on native architectures
-- Performing Test HAS_NO_OFFLOAD_UNIFORM_BLOCK
-- Performing Test HAS_NO_OFFLOAD_UNIFORM_BLOCK - Success
-- Adding the fno-offload-uniform-block compiler flag
-- Performing Test HAS_LSR_DROP_SOLUTION
-- Performing Test HAS_LSR_DROP_SOLUTION - Success
-- Adding the lsr-drop-solution=1 compiler flag
-- Performing Test HAS_ENABLE_POST_MISCHED
-- Performing Test HAS_ENABLE_POST_MISCHED - Success
-- Adding the enable-post-misched=0 compiler flag
-- Performing Test check-coerce
-- Performing Test check-coerce - Success
-- Adding the amdgpu-coerce-illegal-types=1
-- Adding -amdgpu-early-inline-all=true and -amdgpu-function-calls=false
-- CMAKE_CXX_COMPILER: /usr/lib64/rocm/llvm/bin/clang++
-- CMAKE_HIP_COMPILER: /usr/lib64/rocm/llvm/bin/clang++
-- OpenMP_CXX_LIB_NAMES: libomp;libgomp;libiomp5
-- OpenMP_gomp_LIBRARY:
-- OpenMP_pthread_LIBRARY:
-- OpenMP_CXX_FLAGS: -fopenmp=libomp -Wno-unused-command-line-argument
-- Build with HIP
-- CMAKE_CXX_FLAGS: -fuse-ld=bfd
-- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_ngchw_gkcyx_ngkhw_bf16_instances
-- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_ngchw_gkcyx_ngkhw_f16_instances
-- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_ngchw_gkcyx_ngkhw_bf16_comp_instances
-- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_inter_instances
-- Generating sharded instantiations for target: device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_intra_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_bf16_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f16_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_bf16_mem_inter_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f16_mem_inter_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f32_mem_inter_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_bf16_mem_intra_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f16_mem_intra_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f32_mem_intra_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_bf16_comp_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ndhwgc_gkzyxc_ndhwgk_f16_comp_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_bf16_comp_instances
-- Generating sharded instantiations for target: device_grouped_conv3d_fwd_xdl_ngcdhw_gkczyx_ngkdhw_f16_comp_instances
-- Could NOT find Python3 (missing: Python3_INCLUDE_DIRS Python3_LIBRARIES Development Development.Module Development.Embed) (found version "3.14.0")
-- Configuring done (7.2s)
-- Generating done (0.6s)
CMake Warning:
  Manually-specified variables were not used by the project:
    CMAKE_CXX_FLAGS_RELEASE
    CMAKE_C_FLAGS_RELEASE
    CMAKE_Fortran_FLAGS_RELEASE
    CMAKE_INSTALL_DO_STRIP
    CMAKE_VERBOSE_MAKEFILE
    LIB_SUFFIX
    SHARE_INSTALL_PREFIX
    SYSCONF_INSTALL_DIR
-- Build files have been written to: /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build
+ /usr/bin/cmake --build redhat-linux-build --verbose
Change Dir: '/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build'
Run Build Command(s): /usr/bin/ninja-build -v
[1/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_f16_instance.cpp [2/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp [3/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f32_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_f32_instance.cpp [4/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_f8_instance.cpp [5/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/device_avg_pool2d_bwd_nhwc_int8_instance.cpp [6/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp [7/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp [8/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp [9/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors 
-Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp [10/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp [11/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp [12/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp [13/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp [14/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp [15/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp [16/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp [17/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp [18/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS 
-DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp [19/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp [20/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 
-DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp [21/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp [22/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp [23/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp [24/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp [25/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs 
-Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp [26/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp [27/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp [28/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp [29/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp [30/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp [31/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized 
-Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp [32/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_forward_f16_instance.cpp [33/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp [34/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS 
-DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_forward_f32_instance.cpp [35/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_forward_bf16_instance.cpp [36/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_forward_f64_instance.cpp [37/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables 
-Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_f16_instance.cpp [38/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_f32_instance.cpp [39/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 
-DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_bf16_instance.cpp [40/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f64_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f64_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f64_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_f64_instance.cpp [41/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f16_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_infer_f16_instance.cpp [42/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_infer_f32_instance.cpp [43/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_bf16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_infer_bf16_instance.cpp [44/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f64_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f64_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f64_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_infer_f64_instance.cpp [45/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat 
-Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnwc_1d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnwc_1d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnwc_1d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_gnwc_1d_instance.cpp [46/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnhwc_2d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnhwc_2d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnhwc_2d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_gnhwc_2d_instance.cpp [47/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gndhwc_3d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gndhwc_3d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gndhwc_3d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_gndhwc_3d_instance.cpp [48/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nwgc_1d_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nwgc_1d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nwgc_1d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_nwgc_1d_instance.cpp [49/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nhwgc_2d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nhwgc_2d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nhwgc_2d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_nhwgc_2d_instance.cpp [50/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG 
-std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_ndhwgc_3d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_ndhwgc_3d_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_ndhwgc_3d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/column_to_image/device_column_to_image_ndhwgc_3d_instance.cpp [51/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter 
-Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx11-generic. 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx12-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for host. [52/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These 
instances are getting deprecated" | ^ 1 warning generated when compiling for gfx11-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx12-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for host. [53/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp:75:9: warning: These 
instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx11-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for gfx12-generic. /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/conv2d_bwd_data/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp:75:9: warning: These instances are getting deprecated [-W#pragma-messages] 75 | #pragma message "These instances are getting deprecated" | ^ 1 warning generated when compiling for host. [54/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/elementwise/CMakeFiles/device_elementwise_instance.dir/device_normalize_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/elementwise/CMakeFiles/device_elementwise_instance.dir/device_normalize_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/elementwise/CMakeFiles/device_elementwise_instance.dir/device_normalize_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/elementwise/device_normalize_instance.cpp [55/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/elementwise_normalization/CMakeFiles/device_elementwise_normalization_instance.dir/device_elementwise_normalization_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/elementwise_normalization/CMakeFiles/device_elementwise_normalization_instance.dir/device_elementwise_normalization_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/elementwise_normalization/CMakeFiles/device_elementwise_normalization_instance.dir/device_elementwise_normalization_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/elementwise_normalization/device_elementwise_normalization_f16_instance.cpp [56/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp [57/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp [58/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp [59/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp [60/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp [61/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp [62/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp [63/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp [64/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp [65/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp [66/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored 
-Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp [67/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp [68/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp [69/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp [70/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp [71/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp [72/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp [73/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp [74/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp [75/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp [76/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp [77/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp [78/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp [79/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp [80/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch 
-Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp [81/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp [82/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp [83/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp [84/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp [85/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp [86/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp [87/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS 
-DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp [88/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp [89/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp [90/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp [91/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp [92/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp [93/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp [94/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp [95/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp [96/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp [97/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp [98/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp [99/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp [100/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments 
-Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp [101/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include 
-fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp [102/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter 
-Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp [103/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point 
-Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp [104/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp [105/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored 
-Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp [106/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp [107/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp [108/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp [109/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp [110/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp [111/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors 
-Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp [112/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp [113/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables 
-Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp [114/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored 
-Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp [115/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp [116/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp [117/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp [118/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp [119/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp [120/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp [121/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp [122/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp [123/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors 
-Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp [124/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp [125/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp [126/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp [127/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp [128/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_bilinear/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp [129/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 
-DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm 
-greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_b_scale/CMakeFiles/device_gemm_b_scale_instance.dir/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_b_scale/CMakeFiles/device_gemm_b_scale_instance.dir/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_b_scale/CMakeFiles/device_gemm_b_scale_instance.dir/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_b_scale/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp [130/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point 
-Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_bilinear/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp [131/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_bilinear/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp [132/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_bilinear/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp [133/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 
-DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic 
-mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp [134/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat 
-Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp [135/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp [136/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp [137/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp [138/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 
-DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp [139/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp [140/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp [141/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp [142/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp [143/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 
-DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp [144/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o 
-c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp [145/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp [146/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp [147/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp [148/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp [149/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp [150/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp [151/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp [152/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp [153/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp [154/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp [155/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp [156/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp [157/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp [158/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp [159/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp [160/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp [161/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp [162/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp [163/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp [164/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp [165/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp [166/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp [167/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp [168/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs 
-Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp [169/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp [170/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp [171/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp [172/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp [173/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp [174/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp [175/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp [176/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp [177/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp [178/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp [179/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp [180/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp [181/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp [182/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp [183/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp [184/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp [185/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp [186/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp [187/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp [188/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp [189/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp [190/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp [191/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp [192/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp [193/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp [194/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp [195/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp [196/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp [197/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 
-DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm 
-greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp [198/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self 
-Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp [199/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp [200/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp [201/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp [202/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp [203/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp [204/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp [205/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp [206/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp [207/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx12-generic -mllvm -greedy-reverse-local-assignment=1 -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp [208/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/gemm_universal/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp [209/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage 
-Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp [210/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels 
-Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp [211/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion 
-Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp [212/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp [213/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi 
-Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp [214/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp [215/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp [216/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 
-DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp [217/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp [218/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture 
-Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp [219/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels 
-Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp [220/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 
-Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp [221/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp [222/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi 
-Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp [223/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp [224/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp [225/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp [226/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat 
-Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp [227/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp [228/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp [229/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp [230/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp [231/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct 
-Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp [232/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp [233/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct 
-Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp [234/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp [235/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp [236/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp [237/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp [238/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp [239/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp [240/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp [241/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp [242/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp [243/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp [244/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp [245/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized 
-Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp [246/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp [247/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef 
-Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp [248/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp [249/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch 
-Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp [250/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp [251/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp [252/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture 
-Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp [253/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat 
-Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp [254/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp [255/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp [256/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp [257/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS 
-DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp [258/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp [259/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp [260/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp [261/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture 
-Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp [262/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat 
-Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp [263/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp [264/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp [265/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp [266/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS 
-DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp [267/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp [268/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp [269/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp [270/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture 
-Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp [271/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp [272/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion 
-Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp [273/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp [274/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp [275/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp [276/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp [277/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 
-DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic 
-MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp [278/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored 
-Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp [279/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp [280/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized 
-Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp [281/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp [282/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type 
-Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp [283/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp [284/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp [285/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat 
-Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp [286/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch 
-Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnwc_1d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnwc_1d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnwc_1d_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_gnwc_1d_instance.cpp [287/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnhwc_2d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnhwc_2d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnhwc_2d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_gnhwc_2d_instance.cpp [288/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gndhwc_3d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gndhwc_3d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gndhwc_3d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_gndhwc_3d_instance.cpp [289/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nwgc_1d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nwgc_1d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nwgc_1d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_nwgc_1d_instance.cpp [290/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt 
-Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nhwgc_2d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nhwgc_2d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nhwgc_2d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_nhwgc_2d_instance.cpp [291/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_ndhwgc_3d_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_ndhwgc_3d_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_ndhwgc_3d_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/image_to_column/device_image_to_column_ndhwgc_3d_instance.cpp [292/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors 
-Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_f16_instance.cpp [293/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_bf16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_bf16_instance.cpp [294/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables 
-Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_f32_instance.cpp [295/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_f8_instance.cpp [296/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/max_pool_bwd/device_max_pool_bwd_int8_instance.cpp [297/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_groupnorm_bwd_data_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_groupnorm_bwd_data_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_groupnorm_bwd_data_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_data/device_groupnorm_bwd_data_f32_instance.cpp [298/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_data/device_layernorm2d_bwd_data_f16_instance.cpp [299/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_data/device_layernorm2d_bwd_data_f32_instance.cpp [300/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_groupnorm_bwd_gamma_beta_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_groupnorm_bwd_gamma_beta_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_groupnorm_bwd_gamma_beta_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/device_groupnorm_bwd_gamma_beta_f32_instance.cpp [301/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp [302/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS 
-DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp [303/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp [304/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 
-specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/tensor_operation_instance/gpu/CMakeFiles/device_conv_operations.dir/link.d -shared -Wl,-soname,libdevice_conv_operations.so.1 -o lib/libdevice_conv_operations.so.1.1.0 library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/conv2d_bwd_data/CMakeFiles/device_conv2d_bwd_data_instance.dir/device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_gnwc_gkxc_gnwk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv1d_bwd_weight/CMakeFiles/device_grouped_conv1d_bwd_weight_instance.dir/dl/device_grouped_conv1d_bwd_weight_dl_nwgc_gkxc_nwgk_bf16_f32_bf16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_data/CMakeFiles/device_grouped_conv2d_bwd_data_instance.dir/wmma/device_grouped_conv2d_bwd_data_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_gnhwc_gkyxc_gnhwk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_bwd_weight/CMakeFiles/device_grouped_conv2d_bwd_weight_instance.dir/dl/device_grouped_conv2d_bwd_weight_dl_nhwgc_gkyxc_nhwgk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_f16_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_gnhwc_gkyxc_gnhwk_i8_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1p0_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_f16_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/wmma/device_grouped_conv2d_fwd_wmma_nhwgc_gkyxc_nhwgk_i8_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_data/CMakeFiles/device_grouped_conv3d_bwd_data_instance.dir/wmma/device_grouped_conv3d_bwd_data_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_gndhwc_gkzyxc_gndhwk_bf16_f32_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/dl/device_grouped_conv3d_bwd_weight_dl_ndhwgc_gkzyxc_ndhwgk_bf16_f32_bf16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_bwd_weight/CMakeFiles/device_grouped_conv3d_bwd_weight_instance.dir/wmma/device_grouped_conv3d_bwd_weight_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_1x1s1p0_instance.cpp.o 
library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_1x1s1p0_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_f16_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_gndhwc_gkzyxc_gndhwk_i8_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_f16_oddc_instance.cpp.o library/src/tensor_operation_instance/gpu/grouped_conv3d_fwd/CMakeFiles/device_grouped_conv3d_fwd_instance.dir/wmma/device_grouped_conv3d_fwd_wmma_ndhwgc_gkzyxc_ndhwgk_i8_oddc_instance.cpp.o -Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link /usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [305/455] /usr/bin/cmake -E cmake_symlink_library lib/libdevice_conv_operations.so.1.1.0 lib/libdevice_conv_operations.so.1 lib/libdevice_conv_operations.so && : [306/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f16_instance.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_layernorm2d_fwd_f16_instance.cpp [307/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_layernorm4d_fwd_f16_instance.cpp [308/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code 
-Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_f16_instance.cpp [309/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_swish_f16_instance.cpp [310/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_layernorm2d_fwd_f32_instance.cpp [311/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 
-DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp [312/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_layernorm4d_fwd_f32_instance.cpp [313/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_f32_instance.cpp [314/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi 
-Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp16_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_1d_fp16_instances.cpp [315/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f32_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/normalization_fwd/device_groupnorm_fwd_swish_f32_instance.cpp [316/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types 
-Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp16_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_2d_fp16_instances.cpp [317/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 
-fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp16_instances.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_3d_fp16_instances.cpp [318/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp16_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_4d_fp16_instances.cpp [319/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp16_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_5d_fp16_instances.cpp [320/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp16_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp16_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp16_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_6d_fp16_instances.cpp [321/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_1d_fp32_instances.cpp [322/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp32_instances.cpp.o -MF 
library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_2d_fp32_instances.cpp [323/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_3d_fp32_instances.cpp [324/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp32_instances.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_4d_fp32_instances.cpp [325/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_5d_fp32_instances.cpp [326/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_6d_fp32_instances.cpp [327/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 
-DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_f16_instance.cpp [328/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_f16_instance.cpp [329/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f32_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_f32_instance.cpp [330/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_f32_instance.cpp [331/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp [332/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_fp8_instances.cpp.o -MF library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_fp8_instances.cpp.o.d -o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_fp8_instances.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/permute_scale/device_permute_scale_6d_fp32_fp8_instances.cpp [333/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_bf16_instance.cpp [334/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 
-DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_i8_instance.cpp [335/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_i8_instance.cpp [336/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f8_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_avg_pool2d_fwd_nhwc_f8_instance.cpp [337/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 
-Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool2d_fwd/device_max_pool2d_fwd_nhwc_f8_instance.cpp [338/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self 
-Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp [339/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_f16_instance.cpp [340/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_f8_instance.cpp [341/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp [342/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_i8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_i8_instance.cpp [343/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp.o -MF 
library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp [344/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp [345/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f32_instance.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_f32_instance.cpp [346/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp [347/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/pool3d_fwd/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp [348/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp [349/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp [350/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS 
-DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp [351/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp [352/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp [353/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp [354/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp [355/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat 
-Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/quantization/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp [356/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_min.cpp
[357/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/tensor_operation_instance/gpu/CMakeFiles/device_gemm_operations.dir/link.d -shared -Wl,-soname,libdevice_gemm_operations.so.1 -o lib/libdevice_gemm_operations.so.1.1.0 library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_wmma_universal_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gkn_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gmk_gnk_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gkn_gmn_irregular_instance.cpp.o 
library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_f16_f16_f16_gkm_gnk_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gkn_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gmk_gnk_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gkn_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/batched_gemm_multi_d/CMakeFiles/device_batched_gemm_multi_d_instance.dir/device_batched_gemm_multi_d_dl_i8_i8_i8_gkm_gnk_gmn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_mk_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_add_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_default_pipeline_v2_opt_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_add_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_default_pipeline_v2_opt_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/km_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_add_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_default_pipeline_v2_opt_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_interwave_pipeline_v1_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_kn_mn_irregular_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_add_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_default_pipeline_v2_opt_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_default_pipeline_v2_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_xdl_f16_f16_f16/mk_nk_mn_irregular_interwave_pipeline_v1_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_f16_f16_f16_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_kn_mn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_bf16_bf16_bf16_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_mk_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_wmma_int8_int8_int8_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_b_scale/CMakeFiles/device_gemm_b_scale_instance.dir/device_gemm_b_scale_wmma_f16_i4_f16/device_gemm_b_scale_wmma_f16_i4_f16_mk_nk_mn_mem_v2_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_kn_mn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_km_nk_mn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_kn_mn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_bilinear/CMakeFiles/device_gemm_bilinear_instance.dir/device_gemm_bilinear_wmma_c_shuffle_i8_i8_i8_i8_mk_nk_mn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_default_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f16_f16/device_gemm_wmma_universal_f16_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_bf16_bf16/device_gemm_wmma_universal_bf16_bf16_bf16_km_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_bf16_i4_bf16/device_gemm_wmma_universal_bf16_i4_bf16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_mk_nk_mn_comp_default_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_i4_f16/device_gemm_wmma_universal_f16_i4_f16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f16_f16/device_gemm_wmma_universal_f8_f16_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_default_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_mk_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_kpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f16_f8_f16/device_gemm_wmma_universal_f16_f8_f16_km_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnpadding_instance.cpp.o 
library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_kn_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_default_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_kpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/gemm_universal/CMakeFiles/device_gemm_universal_instance.dir/device_gemm_wmma_universal_f8_f8_bf16/device_gemm_wmma_universal_f8_f8_bf16_mk_nk_mn_comp_mnkpadding_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp.o 
library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp.o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp.o -Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link /usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [358/455] /usr/bin/cmake -E cmake_symlink_library lib/libdevice_gemm_operations.so.1.1.0 lib/libdevice_gemm_operations.so.1 lib/libdevice_gemm_operations.so && : [359/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_max.cpp [360/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp [361/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_add.cpp [362/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp [363/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp [364/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_add.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_add.cpp [365/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp [366/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp [367/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_min.cpp [368/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_max.cpp [369/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_add.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_add.cpp [370/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp [371/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp [372/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_add.cpp [373/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp [374/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp [375/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp [376/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_min.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_min.cpp [377/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_max.cpp [378/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_add.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i32_i8_add.cpp [379/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp [380/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp [381/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_min.cpp [382/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_max.cpp [383/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_add.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_add.cpp [384/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp [385/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp [386/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp [387/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_min.cpp [388/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 
-DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip 
--offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_max.cpp [389/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_min.cpp [390/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_max.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_max.cpp [391/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp [392/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_add.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_add.cpp [393/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp [394/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp [395/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp [396/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_add.cpp [397/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp [398/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp [399/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC 
-Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_min.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_min.cpp [400/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_max.cpp [401/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_add.cpp [402/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp [403/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp [404/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp [405/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_add.cpp [406/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp [407/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp [408/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_min.cpp [409/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 
-DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf 
--offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_max.cpp [410/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic 
-Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp [411/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_add.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i32_i8_add.cpp [412/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp [413/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_min.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_min.cpp [414/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_max.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_max.cpp [415/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp [416/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_add.cpp [417/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp [418/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp [419/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded 
-Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_min.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_min.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_min.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_min.cpp [420/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall 
-Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_max.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_max.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_max.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_max.cpp [421/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension 
-Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp [422/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers 
-Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp [423/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 
-DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false 
-fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp [424/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp [425/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 
-D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp [426/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion 
-Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp [427/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp [428/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp [429/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp [430/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp.o -MF library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp [431/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp.o -MF 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp.o.d -o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp [432/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes 
-Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce1.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce1.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce1.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce1.cpp [433/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG 
-std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce2.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce2.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce2.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce2.cpp [434/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed 
-Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce3.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce3.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce3.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce3.cpp [435/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment 
-Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce1.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce1.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce1.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce1.cpp [436/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 
-DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic 
--offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce2.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce2.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce2.cpp [437/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors 
-Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce3.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce3.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce3.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce3.cpp [438/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce1.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce1.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce1.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce1.cpp [439/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument 
-Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce4.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce4.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce4.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce4.cpp [440/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce2.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce2.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce2.cpp [441/455] 
/usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm 
-amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce3.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce3.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce3.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce3.cpp [442/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare 
-Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce1.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce1.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce1.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce1.cpp [443/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce2.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce2.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce2.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce2.cpp [444/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion 
-Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce3.cpp.o -MF library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce3.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce3.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce3.cpp [445/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o -MF library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o.d -o 
library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/transpose/device_transpose_instances_3d.cpp [446/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dutility_EXPORTS -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables 
-Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -x hip -MD -MT library/src/utility/CMakeFiles/utility.dir/device_memory.cpp.o -MF library/src/utility/CMakeFiles/utility.dir/device_memory.cpp.o.d -o library/src/utility/CMakeFiles/utility.dir/device_memory.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/utility/device_memory.cpp [447/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dutility_EXPORTS -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion 
-Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -x hip -MD -MT library/src/utility/CMakeFiles/utility.dir/host_tensor.cpp.o -MF library/src/utility/CMakeFiles/utility.dir/host_tensor.cpp.o.d -o library/src/utility/CMakeFiles/utility.dir/host_tensor.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/utility/host_tensor.cpp [448/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dutility_EXPORTS -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier 
-Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -x hip -MD -MT library/src/utility/CMakeFiles/utility.dir/convolution_parameter.cpp.o -MF library/src/utility/CMakeFiles/utility.dir/convolution_parameter.cpp.o.d -o library/src/utility/CMakeFiles/utility.dir/convolution_parameter.cpp.o -c 
/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/utility/convolution_parameter.cpp [449/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/tensor_operation_instance/gpu/CMakeFiles/device_other_operations.dir/link.d -shared -Wl,-soname,libdevice_other_operations.so.1 -o lib/libdevice_other_operations.so.1.1.0 library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool2d_bwd/CMakeFiles/device_avg_pool2d_bwd_instance.dir/device_avg_pool2d_bwd_nhwc_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f64_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_infer_f64_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnwc_1d_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gnhwc_2d_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_gndhwc_3d_instance.cpp.o 
library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nwgc_1d_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_nhwgc_2d_instance.cpp.o library/src/tensor_operation_instance/gpu/column_to_image/CMakeFiles/device_column_to_image_instance.dir/device_column_to_image_ndhwgc_3d_instance.cpp.o library/src/tensor_operation_instance/gpu/elementwise/CMakeFiles/device_elementwise_instance.dir/device_normalize_instance.cpp.o library/src/tensor_operation_instance/gpu/elementwise_normalization/CMakeFiles/device_elementwise_normalization_instance.dir/device_elementwise_normalization_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnwc_1d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gnhwc_2d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_gndhwc_3d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nwgc_1d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_nhwgc_2d_instance.cpp.o library/src/tensor_operation_instance/gpu/image_to_column/CMakeFiles/device_image_to_column_instance.dir/device_image_to_column_ndhwgc_3d_instance.cpp.o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_bf16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/max_pool_bwd/CMakeFiles/device_max_pool_bwd_instance.dir/device_max_pool_bwd_int8_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_groupnorm_bwd_data_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_data/CMakeFiles/device_normalization_bwd_data_instance.dir/device_layernorm2d_bwd_data_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_groupnorm_bwd_gamma_beta_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_bwd_gamma_beta/CMakeFiles/device_normalization_bwd_gamma_beta_instance.dir/device_layernorm2d_bwd_gamma_beta_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f16_instance.cpp.o 
library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f16_f32_f32_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm2d_fwd_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_layernorm4d_fwd_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/normalization_fwd/CMakeFiles/device_normalization_fwd_instance.dir/device_groupnorm_fwd_swish_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp16_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_1d_fp32_instances.cpp.o 
library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_2d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_3d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_4d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_5d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_instances.cpp.o library/src/tensor_operation_instance/gpu/permute_scale/CMakeFiles/device_permute_scale_instance.dir/device_permute_scale_6d_fp32_fp8_instances.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_i8_instance.cpp.o 
library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_avg_pool2d_fwd_nhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool2d_fwd/CMakeFiles/device_pool2d_fwd_instance.dir/device_max_pool2d_fwd_nhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_i8_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_f32_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_avg_pool3d_fwd_ndhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/pool3d_fwd/CMakeFiles/device_pool3d_fwd_instance.dir/device_max_pool3d_fwd_ndhwc_bf16_instance.cpp.o library/src/tensor_operation_instance/gpu/transpose/CMakeFiles/device_transpose_instance.dir/device_transpose_instances_3d.cpp.o 
-Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link /usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [450/455] /usr/bin/cmake -E cmake_symlink_library lib/libdevice_other_operations.so.1.1.0 lib/libdevice_other_operations.so.1 lib/libdevice_other_operations.so && : [451/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/utility/CMakeFiles/utility.dir/link.d -shared -Wl,-soname,libutility.so.1 -o lib/libutility.so.1.1.0 library/src/utility/CMakeFiles/utility.dir/device_memory.cpp.o library/src/utility/CMakeFiles/utility.dir/host_tensor.cpp.o library/src/utility/CMakeFiles/utility.dir/convolution_parameter.cpp.o -Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link /usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [452/455] /usr/bin/cmake -E cmake_symlink_library lib/libutility.so.1.1.0 lib/libutility.so.1 lib/libutility.so && : [453/455] /usr/lib64/rocm/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DDL_KERNELS -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/include 
-I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/include -I/builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/redhat-linux-build/include -fuse-ld=bfd -O2 -g -DNDEBUG -std=c++20 -fPIC -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-parameter -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-coerce-illegal-types=1 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fcolor-diagnostics -gsplit-dwarf --offload-compress -x hip --offload-arch=gfx11-generic --offload-arch=gfx12-generic -MD -MT library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce4.cpp.o -MF 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce4.cpp.o.d -o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce4.cpp.o -c /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/library/src/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce4.cpp [454/455] : && /usr/lib64/rocm/llvm/bin/clang++ -fPIC -fuse-ld=bfd -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Xlinker --dependency-file=library/src/tensor_operation_instance/gpu/CMakeFiles/device_reduction_operations.dir/link.d -shared -Wl,-soname,libdevice_reduction_operations.so.1 -o lib/libdevice_reduction_operations.so.1.1.0 library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f16_f16_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f16_f32_f16_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_add.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f32_f32_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f32_f64_f32_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_f64_f64_f64_amax.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i32_i8_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_i8_i8_i8_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_blockwise_b16_f32_b16_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f16_f16_amax.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f16_f32_f16_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f32_f32_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f32_f64_f32_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_add.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_f64_f64_f64_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i32_i8_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_min.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_i8_i8_i8_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_norm2.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_min.cpp.o 
library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_max.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_threadwise_b16_f32_b16_amax.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.cpp.o library/src/tensor_operation_instance/gpu/reduce/CMakeFiles/device_reduce_instance.dir/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce1.cpp.o 
library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce2.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank3_reduce3.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce1.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce2.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce3.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f16_f16_instance_rank4_reduce4.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce1.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce2.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank3_reduce3.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce1.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce2.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce3.cpp.o library/src/tensor_operation_instance/gpu/softmax/CMakeFiles/device_softmax_instance.dir/device_softmax_f32_f32_instance_rank4_reduce4.cpp.o -Wl,-rpath,::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib64/libamdhip64.so.7.1.25436 --hip-link 
/usr/lib64/rocm/llvm/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a && : clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] [455/455] /usr/bin/cmake -E cmake_symlink_library lib/libdevice_reduction_operations.so.1.1.0 lib/libdevice_reduction_operations.so.1 lib/libdevice_reduction_operations.so && : + RPM_EC=0 ++ jobs -p + exit 0 Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.4xAUer + umask 022 + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + '[' /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT '!=' / ']' + rm -rf /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT ++ dirname /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT + mkdir -p /builddir/build/BUILD/composable_kernel-7.1.0-build + mkdir /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=thin -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg 
-fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + cd composable_kernel-rocm-7.1.0 + DESTDIR=/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT + /usr/bin/cmake --install redhat-linux-build -- Install configuration: "RelWithDebInfo" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/remod.py -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ref -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ref/naive_attention.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ref/README.md -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/pipeline/topk_softmax_warp_per_row_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/pipeline/topk_softmax_warp_per_row_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/pipeline/topk_softmax_warp_per_row_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax/kernel/topk_softmax_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk/block/block_topk_stream_2d_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk/block/block_topk_stream_2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/topk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax/block/block_softmax_2d_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax/block/block_softmax_2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_two_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_one_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/kernel/smoothquant_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant/kernel/moe_smoothquant_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/smoothquant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_two_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_one_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_model_sensitive_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/pipeline/rmsnorm2d_fwd_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d/kernel/rmsnorm2d_fwd_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rmsnorm2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block/block_reduce2d_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block/block_reduce2d_default_policy.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block/block_reduce2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce/block/block_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute/pipeline/generic_petmute_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute/kernel/generic_permute_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/thread -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/thread/thread_welford.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/block/block_norm_reduce_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce/block/block_norm_reduce.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/norm_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_pipeline_two_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_pipeline_one_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/pipeline/layernorm2d_fwd_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d/kernel/layernorm2d_fwd_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/layernorm2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/pipeline/tile_image_to_column_shape.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/pipeline/block_image_to_column_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column/kernel/image_to_column_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/image_to_column.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils/transform_conv_fwd_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils/transform_conv_bwd_weight_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils/grouped_convolution_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/utils/convolution_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/kernel/grouped_convolution_forward_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution/kernel/grouped_convolution_backward_weight_kernel.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/grouped_convolution.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/tile_gemm_aquant_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_group_quant_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_aquant_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_aquant_pipeline_ag_bg_cr_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_aquant_pipeline_ag_bg_cr_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/pipeline/gemm_aquant_pipeline_ag_bg_cr_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/kernel/gemm_aquant_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant/block/block_universal_gemm_as_aquant_bs_cr.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm_group_quant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_smfmac_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_dispatcher.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_attribute_smfmac_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_attribute_smfmac.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_attribute_mfma_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm_attribute_mfma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/warp/warp_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/wp_pipeline_agmem_bgmem_creg_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/wp_pipeline_agmem_bgmem_creg_v1.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/wp_pipeline_agmem_bgmem_creg_base_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/tile_gemm_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/tile_gemm_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_universal_pipeline_ag_bg_cr_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_agmem_bgmem_creg_v2_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_agmem_bgmem_creg_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_agmem_bgmem_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_agmem_bgmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_scheduler.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_mem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v5_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v5.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v4_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_comp_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/pipeline/gemm_pipeline_ag_bg_cr_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/universal_gemm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/grouped_gemm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/gemm_tile_partitioner.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/gemm_multi_d_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/gemm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/kernel/batched_gemm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_wp_asmem_bsmem_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_wp_asmem_bsmem_creg_v1.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_universal_gemm_as_bs_cr.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_bsmem_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_bsmem_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_bsmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_breg_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_breg_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_asmem_breg_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v2r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v2_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v2_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v1_default_policy.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bsmem_creg_one_warp_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_breg_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_breg_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_breg_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bgmem_creg_v1_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm/block/block_gemm_areg_bgmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/moe_sorting_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/moe_sorting_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_traits.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_pipeline_flatmm_uk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_pipeline_flatmm_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/pipeline/fused_moegemm_pipeline_flatmm_ex.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/moe_sorting_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/moe_sorting_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/fused_moegemm_tile_partitioner.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/fused_moegemm_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe/kernel/fused_moegemm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fused_moe.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/tile_fmha_traits.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/tile_fmha_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qx_ks_vs_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qs_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qs_ks_vs.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_whole_k_prefetch_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_whole_k_prefetch.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_fp8.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_async_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_async.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_problem.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_enum.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_qr_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_qr_ks_vs.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_nwarp_sshuffle_qr_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_nwarp_sshuffle_qr_ks_vs.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_combine_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_combine_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_pagedkv_pipeline_qr_ks_vs_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_pagedkv_pipeline_qr_ks_vs.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_appendkv_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_appendkv_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_pipeline_problem.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_pipeline_enum.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_dq_dk_dv_pipeline_kr_ktr_vr_iglp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_dq_dk_dv_pipeline_kr_ktr_vr.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_dot_do_o.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_bwd_convert_dq.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_batch_prefill_pipeline_qr_ks_vs_async_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/pipeline/block_fmha_batch_prefill_pipeline_qr_ks_vs_async.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_splitkv_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_splitkv_combine_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_pagedkv_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_kernel.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_appendkv_tile_partitioner.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_fwd_appendkv_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_bwd_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/kernel/fmha_batch_prefill_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/variants.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/page_block_navigator.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_rotary_embedding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_position_encoding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_masking.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_dropout.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha/block/block_attention_bias_enum.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/pipeline -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/pipeline/tile_flatmm_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/pipeline/flatmm_pipeline_agmem_bgmem_creg_v1_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/pipeline/flatmm_pipeline_agmem_bgmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/kernel/flatmm_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk/flatmm_uk_gfx9_32x512x128_1x1x1_16x16x16.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk/flatmm_sn_uk_gfx9_32x128x512_1x4x1_16x16x16_itl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk/flatmm_sn_uk_gfx9_32x128x512_1x4x1_16x16x16.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/uk/README.md -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/flatmm_uk_config.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/flatmm_sn_32x128x512_1x4x1_16x16x32_itl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/flatmm_sn_32x128x512_1x4x1_16x16x32.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/flatmm_32x512x128_1x4x1_16x16x32.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/block_flatmm_asmem_bsmem_creg_v1_custom_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm/block/block_flatmm_asmem_bsmem_creg_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/flatmm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue/dynamic_quant_epilogue.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue/default_2d_epilogue.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue/default_2d_and_dynamic_quant_epilogue.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue/cshuffle_epilogue.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/epilogue.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/unary_element_wise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/pipeline/elementwise_shape.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/pipeline/elementwise_pipeline_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/pipeline/elementwise_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/kernel/elementwise_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise/binary_elementwise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common/utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common/tensor_layout.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common/generic_2d_block_shape.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common/README.md -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_problem.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_lds_problem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_lds_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_lds_pipeline.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/pipeline/batched_transpose_common_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose/kernel/batched_transpose_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/batched_transpose.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline/add_rmsnorm2d_rdquant_fwd_pipeline_three_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline/add_rmsnorm2d_rdquant_fwd_pipeline_problem.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline/add_rmsnorm2d_rdquant_fwd_pipeline_one_pass.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/pipeline/add_rmsnorm2d_rdquant_fwd_pipeline_default_policy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/kernel -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant/kernel/add_rmsnorm2d_rdquant_fwd_kernel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/add_rmsnorm2d_rdquant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/timer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/stream_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/stream_config.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/rotating_buffers.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_transpose.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_topk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_softmax.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_rowwise_quantization2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_rmsnorm2d_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_moe_sorting.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_layernorm2d_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_im2col.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_grouped_conv_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_grouped_conv_bwd_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_fused_moe.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_transpose.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_softmax.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_rotary_position_embedding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_masking.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/reference/reference_batched_dropout.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/ranges.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/kernel_launch.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/joinable_thread.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/host_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/hip_check_error.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/flush_icache.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/fill.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/device_prop.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/device_memory.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/convolution_parameter.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/convolution_host_tensor_descriptor_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/concat.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/check_err.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host/arg_parser.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/host.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/unary_element_function.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/type_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/transpose_vectors.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/to_sequence.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/static_counter.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/reduce_operator.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/random.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/philox_rand.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/magic_div.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/literals.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/ignore.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/functional_with_tuple.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/functional.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/env.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/debug.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/utility/bit_cast.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/update_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/transpose_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_window_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_window_linear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_window_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_window.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_scatter_gather.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_elementwise.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_distribution_encoding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tile_distribution.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_view.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_descriptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_coordinate.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_adaptor_coordinate.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/tensor_adaptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/sweep_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/store_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/static_distributed_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/slice_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/shuffle_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/null_tile_window.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/null_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/load_tile_transpose.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/load_tile.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/tensor/buffer_view.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/vector_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/type_convert.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/pk_int4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/pk_fp4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/numeric.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/null_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/mxfp_convert.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/math.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/integral_constant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/integer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/int8.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/half.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/float8.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/numeric/bfloat16.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/tuple.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/thread_buffer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/statically_indexed_array.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/span.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/sequence.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/multi_index.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/meta_data_buffer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/map.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/container_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/container/array.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/config.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/workgroup_barrier.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/utility.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/generic_memory_space_atomic.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/arch.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/amd_transpose_load_encoding.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/amd_buffer_addressing_builtins.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/arch/amd_buffer_addressing.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/static_encoding_pattern.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/space_filling_curve.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/indexing_adaptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/coordinate_transform.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/algorithm/cluster_descriptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core/README.md -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/core.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/README.md -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/bias.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha_bwd.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/fmha_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/mask.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/rotary.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck_tile/ops/utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0 -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1 -- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_other_operationsTargets.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_other_operationsTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0 -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1 -- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_gemm_operationsTargets.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_gemm_operationsTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0 -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1 -- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_conv_operationsTargets.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_conv_operationsTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0 -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1 -- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_reduction_operationsTargets.cmake 
-- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kerneldevice_reduction_operationsTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/ck.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils/tensor_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils/tensor_partition.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils/layout_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/utils/kernel_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/traits -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/traits/blockwise_gemm_xdl_traits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/operations -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/operations/gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/operations/copy.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/wrapper/layout.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/version.h.in 
-- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/workgroup_synchronization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/workgroup_barrier.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/type_convert.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/tuple_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/tuple.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/transpose_vectors.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/thread_group.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/synchronization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/statically_indexed_array_multi_index.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/statically_indexed_array.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/static_buffer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/span.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/sequence_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/sequence.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/scaled_type_convert.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/reduction_operator.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/reduction_functions_accumulate.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/reduction_enums.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/reduction_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/random_gen.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/numeric_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/numeric_limits.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/number.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/mxfp_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/mxf8_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/mxf6_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/mxf4_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/multi_index.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/math_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/math.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/magic_division.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/loop_scheduler.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/is_known_at_compile_time.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/is_detected.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/integral_constant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/inner_product_dpp8.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/inner_product.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/ignore.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/get_shift.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/get_id.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/generic_memory_space_atomic.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/functional4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/functional3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/functional2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/functional.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/flush_icache.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/filter_tuple.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/f8_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/env.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/enable_if.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/e8m0.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/dynamic_buffer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/dtype_vector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/dtype_fp64.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/debug.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/data_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/container_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/container_element_picker.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/common_header.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/c_style_pointer_cast.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/blkgemmpipe_scheduler.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/array_multi_index.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/array.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_xdlops.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_wave_read_first_lane.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_smfmac.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_lds.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_inline_asm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_gemm_dpp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_ck_fp8.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_buffer_addressing_builtins.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_buffer_addressing.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/utility/amd_address_space.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_ngchw_to_nhwgc.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_fwd_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_bwd_weight_to_gemm_v2.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_bwd_weight_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_conv_bwd_data_to_gemm_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_contraction_to_gemm_arraybase.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/operator_transform/transform_contraction_to_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp/xdlops_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp/wmma_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp/smfmac_xdlops_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/warp/dpp_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_welford.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v7r3_scatter.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v7r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v7r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v7.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v6r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v6r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v6r1r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v6r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v5r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v4r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v3r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v3r1_gather.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v3r1_dequant.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_v3r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer_util.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_transfer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_tensor_slice_set.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_gemm_dlops_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/threadwise_contraction_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/thread/reduction_functions_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_welford_variance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_splitk_2nd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_splitk_1st.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_naive_variance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_bwd_gamma_beta.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/normalization/gridwise_normalization_bwd_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_tensor_rearrange.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_sparse_embeddings_forward_layernorm_builtins.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_sparse_embeddings_forward_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_set_multiple_buffer_value.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_set_buffer_value.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_put_element_1d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_permute.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_mx_gemm_bpreshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_mx_gemm_bns.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_mx_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_gemm_blockscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_moe_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v3r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v3r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v3r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v2r4r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v2r4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_v2r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_streamk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_splitk_lds_direct_load.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_skip_b_lds_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdlops_bwd_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_waveletmodel_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_layernorm_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_mx_bpreshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_d_blockscale_b_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_d_b_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_d_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_multi_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_b_scale.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3_b_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_streamk_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_xdl_cshuffle_conv_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_wmma_cshuffle_v3_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_wmma_cshuffle_v3_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_wmma_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_waveletmodel.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_split_k_multiple_d_xdl_cshuffle_v2.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_split_k_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_reduce_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_v4_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_xdl_splitk_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_xdl_cshuffle_lds_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_multiple_r_xdl_cshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_abd_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_dpp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_dl_v1r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_dl_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_gemm_bias_add_reduce_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_fpAintB_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_elementwise_layernorm_welford_variance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_elementwise_2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_elementwise_1d_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batchnorm_forward_blockwise_welford.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batchnorm_backward_blockwise_welford.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_softmax_gemm_xdl_cshuffle_v1.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_softmax_gemm_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_multiple_d_softmax_gemm_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_multiple_d_gemm_multiple_d_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_batched_gemm_gemm_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_reduction_threadwise_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_reduction_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_reduction_multiblock.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_multiple_reduction_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gridwise_2d_multiple_reduction_multiblock.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gemm_layernorm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gemm_layernorm/gridwise_welford_second_half_layernorm2d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/gemm_layernorm/gridwise_gemm_multiple_d_welford_first_half_xdl_cshuffle.hpp 
-- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/block_to_ctile_map.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_welford_second_half_multiblock_reduce_first_half.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_welford_second_half_batchnorm_forward_final_obsolete.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_welford_first_half.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_reduce_second_half_batchnorm_backward_final.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/grid/batchnorm_multiblock/gridwise_multiblock_batchnorm_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/unary_element_wise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/quantization_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/element_wise_operation.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/combined_element_wise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/element/binary_element_wise_operation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/welford_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/tensor_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/tensor_layout.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/reduction_operator_mapping.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/matrix_padder.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/masking_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/split_k_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/split_k_arg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_splitk_contraction_multiple_d_xdl_cshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_sparse_embeddings_forward_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_softmax_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_reduce_threadwise_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_reduce_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_reduce_multiblock.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_reduce_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_put_element_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_pool3d_fwd_ndhwc_ndhwc.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_pool2d_fwd_nhwc_nhwc.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_permute_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_normalization_fwd_splitk_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_normalization_fwd_impl.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_normalization_bwd_gamma_beta_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_normalization_bwd_data_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_multiple_reduce_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_multiple_reduce_multiblock.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_multi_query_attention_forward_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_mx_gemm_bpreshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_mx_gemm_bns.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_mx_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_gemm_blockscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_moe_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_max_pool_bwd_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_image_to_column_impl.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_query_attention_forward_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_xdl_splitk_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_xdl_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_softmax_gemm_permute_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_multiple_d_xdl_cshuffle_tile_loop.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_multiple_d_splitk_xdl_cshuffle_two_stage.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_multiple_d_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_multi_abd_xdl_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_utils.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_xdl_large_tensor_cshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_multiple_r_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_d_multiple_r.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_abd_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_multiple_abd_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_dl_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_fwd_dl_multiple_d_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_wmma_cshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_two_stage_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_explicit_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_weight_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_data_multiple_d_xdl_cshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_conv_bwd_data_multiple_d_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_grouped_contraction_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_waveletmodel_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_streamk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_splitk_c_shuffle_lds_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_splitk_c_shuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_skip_b_lds.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_layernorm_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3_b_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_streamk_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle_lds_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_xdl.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_wmma_cshuffle_v3_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_wmma_cshuffle_v3_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_wmma_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_reduce_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_v3_blockscale_bpreshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_v3_b_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_v3_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle_lds_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_xdl_cshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_multiple_r_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_layernorm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_d_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_multiple_abd_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_dpp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_gemm_bias_add_reduce_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_fpAintB_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_elementwise_scale_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_elementwise_normalization_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_elementwise_dynamic_vector_dims_impl.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_convnd_bwd_data_nwc_kxc_nwk_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_convnd_bwd_data_nwc_kxc_nwk_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv3d_fwd_naive_ndhwc_kzyxc_ndhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_fwd_xdl_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_fwd_xdl_c_shuffle_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_fwd_xdl_c_shuffle_bias_activation_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_fwd_xdl_c_shuffle_bias_activation_add_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_bwd_data_xdl_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_conv2d_backward_weight_xdl_c_shuffle_nhwc_kyxc_nhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_contraction_utils.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_contraction_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_contraction_multiple_abd_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_column_to_image_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_cgemm_4gemm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batchnorm_forward_impl_obsolete.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batchnorm_forward_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batchnorm_backward_impl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_xdl_fpAintB_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_wmma_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_softmax_gemm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_softmax_gemm_permute_xdl_cshuffle.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_softmax_gemm_permute_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_reduce_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_xdl_cshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_gemm_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_dl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multi_d_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_gemm_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_e_permute_xdl.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_contraction_multiple_d_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_batched_contraction_multiple_d_wmma_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_avgpool3d_bwd_ndhwc_ndhwc.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/device_avgpool2d_bwd_nhwc_nhwc.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/impl/codegen_device_grouped_conv_fwd_multiple_abd_xdl_cshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/gemm_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_splitk_contraction_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_reduce_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_put_element.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_pool_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_normalization_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_normalization_bwd_gamma_beta.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_normalization_bwd_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_multiple_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_max_pool_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_tile_loop.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_splitk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_softmax_gemm_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_multi_abd_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_multi_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_fwd_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_fwd_multiple_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_fwd.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_bwd_weight_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_bwd_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_conv_bwd_data_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_grouped_contraction_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_streamk_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_streamk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_splitk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_d_multiple_r.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_d_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_d_ab_scale.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_multiple_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_dequantB.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm_bias_e_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_elementwise_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_elementwise_normalization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_tensor_rearrange.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_fwd_bias_activation_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_fwd_bias_activation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_conv_bwd_data.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_contraction_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_contraction_multiple_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_cgemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batchnorm_infer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batchnorm_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batchnorm_backward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_softmax_gemm_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_softmax_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_multiple_d_gemm_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm_e_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_gemm.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_batched_contraction_multiple_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/device_avgpool_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/convolution_forward_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/convolution_backward_weight_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/convolution_backward_data_specialization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/device/conv_tensor_rearrange_op.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v7r3_scatter.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v7r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v7r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v7.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v6r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v6r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v6r1r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v6r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v4r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v4r1_gather.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v4r1_dequant.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_v4r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_gather_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/thread_group_tensor_slice_transfer_direct_load.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/reduction_functions_blockwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_welford.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_tensor_slice_transfer_v5r1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_xdlops_skip_b_lds.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_xdlops.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_wmma.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_smfmac_xdlops.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v5.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v4_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3_mx_bpreshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3_b_scale.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v2_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v2_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v1_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v1_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v1_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_v3.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_nbs_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_nbs_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_nbs_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_nbs_gufusion_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_moe_gufusion_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_mx_bpreshuffle_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_gufusion_v3.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_moe_blockscale_b_preshuffle_gufusion_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_blockscale_b_preshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_blockscale_b_preshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_blockscale_b_preshuffle_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_scale_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_v2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_mx_moe_v3.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_mx_moe_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_mx_moe_gufusion_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_gufusion_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_gufusion_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_gufusion_dequant_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_dequant_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_b_preshuffle_dequant_v1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_ab_scale_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmmaops_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmmaops_v1.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmmaops_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmmaops.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_wmma_selector.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_mx_pipeline_xdlops_base.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_dpp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_dlops_v3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_dlops_v2r2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_operation/gpu/block/blockwise_gemm_dl_v2r3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/tensor_space_filling_curve.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/tensor_descriptor_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/tensor_descriptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/tensor_adaptor.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/multi_index_transform_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/multi_index_transform.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor_description/cluster_descriptor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/tensor/static_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/stream_config.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/problem_transform -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/problem_transform/transform_forward_convolution3d_into_gemm_v4r4r4_ndhwc_kzyxc_ndhwk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/thread.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/ranges.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/numeric.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/literals.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/iterator.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/host_tensor_generator.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/host_tensor.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/host_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/host_common_util.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/fill.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/device_memory.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/convolution_parameter.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/convolution_host_tensor_descriptor_helper.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/conv_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/check_err.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/utility/algorithm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/stream_utility.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/kernel_launch.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/io.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/hip_check_error.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/flush_cache.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/host_utility/device_prop.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/filesystem.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/config.h.in -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/README.md -- Up-to-date: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck -- Up-to-date: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/transpose_3d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/transpose -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/transpose/device_transpose_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce4.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank4_reduce1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f32_f32_instance_rank3_reduce1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_type.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce4.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce2.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank4_reduce1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce3.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax/device_softmax_f16_f16_instance_rank3_reduce1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i8_i8_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i32_i8_avg.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_i8_i32_i8_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f64_f64_f64_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f64_f32_add.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f32_f32_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f32_f16_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_min.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_f16_f16_f16_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise_b16_f32_b16_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_threadwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_avg.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f64_f64_f64_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f64_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f32_f32_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_f16_f32_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add_b16_f32_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_multiblock_atomic_add.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_impl_common.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i8_i8_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i32_i8_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_i8_i32_i8_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_avg.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f64_f64_f64_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f64_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_amax.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f32_f32_f32_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f32_f16_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_max.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_f16_f16_f16_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_norm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_min.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_max.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_avg.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_amax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise_b16_f32_b16_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance_blockwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/reduce/device_reduce_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/grouped_convolution_forward_perlayer_quantization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/grouped_convolution_forward_perchannel_quantization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/grouped_convolution_bias_forward_perlayer_quantization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/grouped_convolution_bias_forward_perchannel_quantization.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/quantization/gemm_quantization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/pool3d_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/pool2d_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/permute_scale -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/permute_scale/device_permute_scale_instances.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/permute_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/normalization_fwd_swish.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/normalization_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/max_pool_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/layernorm_bwd_gamma_beta.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/layernorm_bwd_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/groupnorm_bwd_gamma_beta.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/groupnorm_bwd_data.hpp -- 
Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_tile_loop_multiply.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_tile_loop.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_multi_abd_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_fixed_nk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_fastgelu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm_bias.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm/device_grouped_gemm_xdl_splitk_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_xdl_merged_groups.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_xdl_large_tensor.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_xdl.inc -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_scaleadd_scaleadd_relu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_scaleadd_ab.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_mem_intra_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_mem_inter_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_dynamic_op.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_dl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_convscale_relu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_convscale_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_convscale.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_convinvscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_comp_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_clamp_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_clamp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_bilinear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_bias_clamp_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward_bias_clamp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_scale.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_explicit_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_dl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight_bilinear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data_bilinear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_convolution_backward_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_scaleadd_scaleadd_relu_instance.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_scaleadd_ab_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_scale_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_outelementop_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_merged_groups_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_mem_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_large_tensor_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_dynamic_op_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_comp_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_binary_outelementop_instance.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_xdl_bilinear_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_wmma_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_fwd/device_grouped_conv_fwd_dl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_xdl_scale_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_xdl_bilinear_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_wmma_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_v3_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_two_stage_xdl_instance.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_grouped_conv_bwd_weight_dl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_weight/device_exp_gemm_xdl_universal_km_kn_mn_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_xdl_scale_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_xdl_bilinear_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_f16_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_transpose_xdl_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_xdl.inc -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_xdl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_wmma.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_streamk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_preshuffle.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_preshuffle.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal_batched.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_universal.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_streamk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_splitk.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_mx.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_multiply_multiply_wp.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_multiply_multiply.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_multiply_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_multi_abd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_fastgelu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_dpp.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_dl.inc -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_blockscale_wp.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_bilinear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_silu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_relu_add_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_relu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_multiply.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_fastgelu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add_add_fastgelu.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm_ab_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/elementwise_normalization.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v2_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_interwave_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_gemm_xdl_c_shuffle_fp8_fp8_fp8_mk_kn_mn_v1_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_gemm_mean_squaremean_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/device_elementwise_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/convolution_forward.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/convolution_backward_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/conv_tensor_rearrange -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/conv_tensor_rearrange/device_image_to_column_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/conv_tensor_rearrange/device_column_to_image_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/conv_tensor_rearrange.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/contraction_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/contraction_bilinear.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/contraction -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/contraction/device_contraction_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batchnorm_infer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batchnorm_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batchnorm_backward.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_softmax_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_multi_d.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_bias_softmax_gemm_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_bias_permute.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_b_scale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm_add_relu_gemm_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/batched_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/avg_pool3d_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/avg_pool2d_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/device_operation_instance_factory.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/add_grouped_conv_bwd_wei_exp_device_operation_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/add_device_operation_instance.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/gpu -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/gpu/reference_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/gpu/naive_conv_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_sparse_embedding3_forward_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_softmax.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_reduce.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_pool_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_mx_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_mx_gemm2.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_mx_gemm1.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_gemm2_blockscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_gemm2.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_gemm1_blockscale.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_moe_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_maxpool_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_layernorm_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_image_to_column.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_groupnorm_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_groupnorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_gemm_multiple_d.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_gemm_layernorm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_fpAintB_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_elementwise.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_fwd_bias_activation_add.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_fwd_bias_activation.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_fwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_bwd_weight.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_conv_bwd_data.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_contraction.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_column_to_image.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_cgemm.hpp -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_batchnorm_infer.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_batchnorm_forward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_batchnorm_backward.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_batched_gemm.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/reference_tensor_operation/cpu/reference_avgpool_bwd.hpp -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libutility.so.1.1.0 -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libutility.so.1 -- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libutility.so.1.1.0" to "$ORIGIN/../lib:$ORIGIN/../lib/composable_kernel/lib" -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libutility.so -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kernelutilityTargets.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kernelutilityTargets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kernelConfig.cmake -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/cmake/composable_kernel/composable_kernelConfigVersion.cmake -- Installing: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/version.h -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/config.h -- Installing: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composablekernel/LICENSE + cp -p -r include/ck_tile /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include + _target= + _symlinks=0 + fdupes -q -n -r -p /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr + read _file + test -z '' + _target=/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp + read _file + test -z /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp + test -z /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_f16_instance.hpp + test 0 = 1 + ln -f /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_f16_instance.hpp + read _file + test -z /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/include/ck/library/tensor_operation_instance/gpu/grouped_conv_bwd_data/device_grouped_conv_bwd_data_wmma_i8_instance.hpp + test -z '' + _target= + continue + read _file + rm -f /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composablekernel/LICENSE + /usr/bin/find-debuginfo -j4 --strict-build-id -m -i --build-id-seed 7.1.0-2.fc44 
--unique-debug-suffix -7.1.0-2.fc44.x86_64 --unique-debug-src-base composable_kernel-7.1.0-2.fc44.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0 find-debuginfo: starting Extracting debug info from 5 files debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubnames debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubtypes debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubnames debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubtypes debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_other_operations.so.1.1.0: Unit type 4 unhandled debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_conv_operations.so.1.1.0: Unit type 4 unhandled debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubnames debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubtypes debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_gemm_operations.so.1.1.0: Unit type 4 unhandled debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubnames debugedit: 
/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0: Unknown debugging section .debug_gnu_pubtypes debugedit: /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/lib64/libdevice_reduction_operations.so.1.1.0: Unit type 4 unhandled DWARF-compressing 5 files dwz: ./usr/lib64/libdevice_conv_operations.so.1.1.0-7.1.0-2.fc44.x86_64.debug: Unknown debugging section .debug_addr dwz: ./usr/lib64/libdevice_gemm_operations.so.1.1.0-7.1.0-2.fc44.x86_64.debug: Unknown debugging section .debug_addr dwz: ./usr/lib64/libdevice_other_operations.so.1.1.0-7.1.0-2.fc44.x86_64.debug: Unknown debugging section .debug_addr dwz: ./usr/lib64/libdevice_reduction_operations.so.1.1.0-7.1.0-2.fc44.x86_64.debug: Unknown debugging section .debug_addr dwz: ./usr/lib64/libutility.so.1.1.0-7.1.0-2.fc44.x86_64.debug: Unknown debugging section .debug_addr dwz: Too few files for multifile optimization sepdebugcrcfix: Updated 0 CRC32s, 5 CRC32s did match. 
Creating .debug symlinks for symlinks to ELF files Copying sources found by 'debugedit -l' to /usr/src/debug/composable_kernel-7.1.0-2.fc44.x86_64 find-debuginfo: done + /usr/lib/rpm/check-buildroot + /usr/lib/rpm/redhat/brp-ldconfig + /usr/lib/rpm/brp-compress + /usr/lib/rpm/redhat/brp-strip-lto /usr/bin/strip + /usr/lib/rpm/check-rpaths + /usr/lib/rpm/redhat/brp-mangle-shebangs + /usr/lib/rpm/brp-remove-la-files + /usr/lib/rpm/redhat/brp-python-rpm-in-distinfo + env /usr/lib/rpm/redhat/brp-python-bytecompile '' 1 0 -j4 + /usr/lib/rpm/redhat/brp-python-hardlink + /usr/bin/add-det --brp -j4 /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT Scanned 141 directories and 1232 files, processed 0 inodes, 0 modified (0 replaced + 0 rewritten), 0 unsupported format, 0 errors + /usr/bin/linkdupes --brp /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr Scanned 140 directories and 1232 files, considered 1222 files, read 156 files, linked 8 files, 0 errors sum of sizes of linked files: 59247 bytes Reading /builddir/build/BUILD/composable_kernel-7.1.0-build/SPECPARTS/rpm-debuginfo.specpart Processing files: composable_kernel-7.1.0-2.fc44.x86_64 Executing(%doc): /bin/sh -e /var/tmp/rpm-tmp.yAlMp4 + umask 022 + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + cd composable_kernel-rocm-7.1.0 + DOCDIR=/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composable_kernel + export LC_ALL=C.UTF-8 + LC_ALL=C.UTF-8 + export DOCDIR + /usr/bin/mkdir -p /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composable_kernel + cp -pr /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/README.md /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/doc/composable_kernel + RPM_EC=0 ++ jobs -p + exit 0 Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.tttKrV + umask 022 + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + cd composable_kernel-rocm-7.1.0 + 
LICENSEDIR=/builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/licenses/composable_kernel + export LC_ALL=C.UTF-8 + LC_ALL=C.UTF-8 + export LICENSEDIR + /usr/bin/mkdir -p /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/licenses/composable_kernel + cp -pr /builddir/build/BUILD/composable_kernel-7.1.0-build/composable_kernel-rocm-7.1.0/LICENSE /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT/usr/share/licenses/composable_kernel + RPM_EC=0 ++ jobs -p + exit 0 Provides: composable_kernel = 7.1.0-2.fc44 composable_kernel(x86-64) = 7.1.0-2.fc44 libdevice_conv_operations.so.1()(64bit) libdevice_gemm_operations.so.1()(64bit) libdevice_other_operations.so.1()(64bit) libdevice_reduction_operations.so.1()(64bit) libutility.so.1()(64bit) Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Requires: libamdhip64.so.7()(64bit) libamdhip64.so.7(hip_4.2)(64bit) libamdhip64.so.7(hip_6.0)(64bit) libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_2.32)(64bit) libc.so.6(GLIBC_2.38)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.5)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.14)(64bit) libstdc++.so.6(GLIBCXX_3.4.15)(64bit) libstdc++.so.6(GLIBCXX_3.4.20)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.26)(64bit) libstdc++.so.6(GLIBCXX_3.4.29)(64bit) libstdc++.so.6(GLIBCXX_3.4.31)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) libstdc++.so.6(GLIBCXX_3.4.9)(64bit) rtld(GNU_HASH) Processing files: composable_kernel-devel-7.1.0-2.fc44.x86_64 Provides: cmake(composable_kernel) = 1.1.0 composable_kernel-devel = 7.1.0-2.fc44 composable_kernel-devel(x86-64) = 7.1.0-2.fc44 
composable_kernel-static = 7.1.0-2.fc44 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PartialHardlinkSets) <= 4.0.4-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Requires: cmake-filesystem(x86-64) libdevice_conv_operations.so.1()(64bit) libdevice_gemm_operations.so.1()(64bit) libdevice_other_operations.so.1()(64bit) libdevice_reduction_operations.so.1()(64bit) libutility.so.1()(64bit) Processing files: composable_kernel-debugsource-7.1.0-2.fc44.x86_64 Provides: composable_kernel-debugsource = 7.1.0-2.fc44 composable_kernel-debugsource(x86-64) = 7.1.0-2.fc44 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Processing files: composable_kernel-debuginfo-7.1.0-2.fc44.x86_64 Provides: composable_kernel-debuginfo = 7.1.0-2.fc44 composable_kernel-debuginfo(x86-64) = 7.1.0-2.fc44 debuginfo(build-id) = 18aedccdca4f93936a1387e0a02b605eac068ef1 debuginfo(build-id) = 6d092f4ccb090a0379399813b29423cc72f0953c debuginfo(build-id) = c5d6a85b9fa4430953f8254f3dec60db6d44e5fe debuginfo(build-id) = c6ca93e78750a4450425b85a1aa1df800a7cde1d debuginfo(build-id) = f674f6a813e3a5f11ccf59f554f9f22ccc6773ff libdevice_conv_operations.so.1.1.0-7.1.0-2.fc44.x86_64.debug()(64bit) libdevice_gemm_operations.so.1.1.0-7.1.0-2.fc44.x86_64.debug()(64bit) libdevice_other_operations.so.1.1.0-7.1.0-2.fc44.x86_64.debug()(64bit) libdevice_reduction_operations.so.1.1.0-7.1.0-2.fc44.x86_64.debug()(64bit) libutility.so.1.1.0-7.1.0-2.fc44.x86_64.debug()(64bit) Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Recommends: composable_kernel-debugsource(x86-64) = 7.1.0-2.fc44 Checking for unpackaged file(s): /usr/lib/rpm/check-files /builddir/build/BUILD/composable_kernel-7.1.0-build/BUILDROOT Wrote: /builddir/build/RPMS/composable_kernel-debugsource-7.1.0-2.fc44.x86_64.rpm Wrote: 
/builddir/build/RPMS/composable_kernel-devel-7.1.0-2.fc44.x86_64.rpm Wrote: /builddir/build/RPMS/composable_kernel-7.1.0-2.fc44.x86_64.rpm Wrote: /builddir/build/RPMS/composable_kernel-debuginfo-7.1.0-2.fc44.x86_64.rpm Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.i1Y59E + umask 022 + cd /builddir/build/BUILD/composable_kernel-7.1.0-build + test -d /builddir/build/BUILD/composable_kernel-7.1.0-build + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w /builddir/build/BUILD/composable_kernel-7.1.0-build + rm -rf /builddir/build/BUILD/composable_kernel-7.1.0-build + RPM_EC=0 ++ jobs -p + exit 0 Finish: rpmbuild composable_kernel-7.1.0-2.fc44.src.rpm Finish: build phase for composable_kernel-7.1.0-2.fc44.src.rpm INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: /var/lib/mock/fedora-rawhide-x86_64-1763473467.053604/root/var/log/dnf5.log INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz /bin/tar: Removing leading `/' from member names INFO: Done(/var/lib/copr-rpmbuild/results/composable_kernel-7.1.0-2.fc44.src.rpm) Config(child) 1684 minutes 28 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_success=True') Start: clean chroot INFO: unmounting tmpfs. Finish: clean chroot Finish: run Running RPMResults tool Package info: { "packages": [ { "name": "composable_kernel", "epoch": null, "version": "7.1.0", "release": "2.fc44", "arch": "x86_64" }, { "name": "composable_kernel", "epoch": null, "version": "7.1.0", "release": "2.fc44", "arch": "src" }, { "name": "composable_kernel-debuginfo", "epoch": null, "version": "7.1.0", "release": "2.fc44", "arch": "x86_64" }, { "name": "composable_kernel-debugsource", "epoch": null, "version": "7.1.0", "release": "2.fc44", "arch": "x86_64" }, { "name": "composable_kernel-devel", "epoch": null, "version": "7.1.0", "release": "2.fc44", "arch": "x86_64" } ] } RPMResults finished