Warning: Permanently added '18.204.227.39' (ED25519) to the list of known hosts. You can reproduce this build on your computer by running: sudo dnf install copr-rpmbuild /usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/7914203-fedora-rawhide-x86_64 --chroot fedora-rawhide-x86_64 Version: 0.73 PID: 7381 Logging PID: 7382 Task: {'allow_user_ssh': False, 'appstream': False, 'background': False, 'build_id': 7914203, 'buildroot_pkgs': [], 'chroot': 'fedora-rawhide-x86_64', 'enable_net': True, 'fedora_review': False, 'git_hash': '8fb011305b380c8bf47bd7a17cb4175c492bbebb', 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/rezso/ML/cutlass', 'isolation': 'default', 'memory_reqs': 2048, 'package_name': 'cutlass', 'package_version': '3.5.0-20240411.3.cu12_6', 'project_dirname': 'ML', 'project_name': 'ML', 'project_owner': 'rezso', 'repo_priority': None, 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/rezso/ML/fedora-rawhide-x86_64/', 'id': 'copr_base', 'name': 'Copr repository', 'priority': None}, {'baseurl': 'https://download.copr.fedorainfracloud.org/results/rezso/CUDA/fedora-rawhide-x86_64/', 'id': 'copr_rezso_CUDA', 'name': 'Additional repo copr_rezso_CUDA'}, {'baseurl': 'http://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64', 'id': 'http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64', 'name': 'Additional repo http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64'}, {'baseurl': 'http://developer.download.nvidia.com/compute/cuda/repos/rhel9/sbsa', 'id': 'http_developer_download_nvidia_com_compute_cuda_repos_rhel9_sbsa', 'name': 'Additional repo http_developer_download_nvidia_com_compute_cuda_repos_rhel9_sbsa'}], 'sandbox': 'rezso/ML--rezso', 'source_json': {}, 'source_type': None, 'ssh_public_keys': None, 'submitter': 'rezso', 'tags': [], 'task_id': '7914203-fedora-rawhide-x86_64', 'timeout': 172800, 'uses_devel_repo': False, 'with_opts': [], 'without_opts': []} Running: git clone https://copr-dist-git.fedorainfracloud.org/git/rezso/ML/cutlass /var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass --depth 500 --no-single-branch --recursive cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/rezso/ML/cutlass', '/var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass', '--depth', '500', '--no-single-branch', '--recursive'] cwd: . rc: 0 stdout: stderr: Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass'... Running: git checkout 8fb011305b380c8bf47bd7a17cb4175c492bbebb -- cmd: ['git', 'checkout', '8fb011305b380c8bf47bd7a17cb4175c492bbebb', '--'] cwd: /var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass rc: 0 stdout: stderr: Note: switching to '8fb011305b380c8bf47bd7a17cb4175c492bbebb'. You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by switching back to a branch. If you want to create a new branch to retain commits you create, you may do so (now or later) by using -c with the switch command. Example: git switch -c Or undo this operation with: git switch - Turn off this advice by setting config variable advice.detachedHead to false HEAD is now at 8fb0113 automatic import of cutlass Running: copr-distgit-client sources cmd: ['copr-distgit-client', 'sources'] cwd: /var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass rc: 0 stdout: stderr: INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD INFO: Reading stdout from command: git rev-parse HEAD INFO: Reading sources specification file: sources /usr/bin/tail: /var/lib/copr-rpmbuild/main.log: file truncated Running (timeout=172800): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass/cutlass.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1723840744.707232 -r /var/lib/copr-rpmbuild/results/configs/child.cfg INFO: mock.py version 5.6 starting (python version = 3.12.1, NVR = mock-5.6-1.fc39), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass/cutlass.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1723840744.707232 -r /var/lib/copr-rpmbuild/results/configs/child.cfg Start(bootstrap): init plugins INFO: tmpfs initialized INFO: selinux enabled INFO: chroot_scan: initialized INFO: compress_logs: initialized Finish(bootstrap): init plugins Start: init plugins INFO: tmpfs initialized INFO: selinux enabled INFO: chroot_scan: initialized INFO: compress_logs: initialized Finish: init plugins INFO: Signal handler active Start: run INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass/cutlass.spec) Config(fedora-rawhide-x86_64) Start: clean chroot Finish: clean chroot Mock Version: 5.6 INFO: Mock Version: 5.6 Start(bootstrap): chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-rawhide-x86_64-bootstrap-1723840744.707232/root. INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start(bootstrap): cleaning package manager metadata Finish(bootstrap): cleaning package manager metadata INFO: Guessed host environment type: unknown INFO: Using bootstrap image: registry.fedoraproject.org/fedora:rawhide INFO: Pulling image: registry.fedoraproject.org/fedora:rawhide INFO: Copy content of container registry.fedoraproject.org/fedora:rawhide to /var/lib/mock/fedora-rawhide-x86_64-bootstrap-1723840744.707232/root INFO: Checking that registry.fedoraproject.org/fedora:rawhide image matches host's architecture INFO: mounting registry.fedoraproject.org/fedora:rawhide with podman image mount INFO: image registry.fedoraproject.org/fedora:rawhide as /var/lib/containers/storage/overlay/ae1e8f90948c2eff7e75a82b94bfe3e4c4b213016071a3db97ed80abc9ee5976/merged INFO: umounting image registry.fedoraproject.org/fedora:rawhide (/var/lib/containers/storage/overlay/ae1e8f90948c2eff7e75a82b94bfe3e4c4b213016071a3db97ed80abc9ee5976/merged) with podman image umount INFO: Package manager dnf5 detected and used (fallback) INFO: Not updating bootstrap chroot, bootstrap_image_ready=True Start(bootstrap): creating root cache Finish(bootstrap): creating root cache Finish(bootstrap): chroot init Start: chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-rawhide-x86_64-1723840744.707232/root. INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Package manager dnf5 detected and used (direct choice) INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-4.19.92-6.fc41.x86_64 rpm-sequoia-1.7.0-2.fc41.x86_64 dnf5-5.2.5.0-2.fc41.x86_64 dnf5-plugins-5.2.5.0-2.fc41.x86_64 Start: installing minimal buildroot with dnf5 Updating and loading repositories: fedora 100% | 38.5 MiB/s | 21.6 MiB | 00m01s Copr repository 100% | 3.2 MiB/s | 116.3 KiB | 00m00s Additional repo copr_rezso_CUDA 100% | 694.5 KiB/s | 47.2 KiB | 00m00s Additional repo http_developer_downloa 100% | 23.0 MiB/s | 377.2 KiB | 00m00s Additional repo http_developer_downloa 100% | 21.5 MiB/s | 286.0 KiB | 00m00s Repositories loaded. Package Arch Version Repository Size Installing group/module packages: bash x86_64 5.2.32-1.fc41 fedora 8.2 MiB bzip2 x86_64 1.0.8-19.fc41 fedora 95.7 KiB coreutils x86_64 9.5-7.fc41 fedora 5.6 MiB cpio x86_64 2.15-2.fc41 fedora 1.1 MiB diffutils x86_64 3.10-8.fc41 fedora 1.6 MiB fedora-release-common noarch 42-0.1 fedora 19.4 KiB findutils x86_64 1:4.10.0-4.fc41 fedora 1.8 MiB gawk x86_64 5.3.0-4.fc41 fedora 1.7 MiB glibc-minimal-langpack x86_64 2.40-3.fc41 fedora 0.0 B grep x86_64 3.11-9.fc41 fedora 1.0 MiB gzip x86_64 1.13-2.fc41 fedora 389.0 KiB info x86_64 7.1-3.fc41 fedora 361.8 KiB patch x86_64 2.7.6-25.fc41 fedora 266.7 KiB redhat-rpm-config noarch 293-1.fc41 fedora 183.5 KiB rpm-build x86_64 4.19.92-6.fc41 fedora 194.1 KiB sed x86_64 4.9-3.fc41 fedora 861.5 KiB shadow-utils x86_64 2:4.15.1-9.fc41 fedora 4.1 MiB tar x86_64 2:1.35-4.fc41 fedora 2.9 MiB unzip x86_64 6.0-64.fc41 fedora 386.8 KiB util-linux x86_64 2.40.2-4.fc41 fedora 3.7 MiB which x86_64 2.21-42.fc41 fedora 80.2 KiB xz x86_64 1:5.6.2-2.fc41 fedora 1.2 MiB Installing dependencies: add-determinism x86_64 0.3.6-1.fc41 fedora 2.2 MiB alternatives x86_64 1.30-1.fc41 fedora 66.3 KiB ansible-srpm-macros noarch 1-16.fc41 fedora 35.7 KiB audit-libs x86_64 4.0.2-1.fc41 fedora 331.3 KiB authselect x86_64 1.5.0-7.fc41 fedora 153.5 KiB authselect-libs x86_64 1.5.0-7.fc41 fedora 818.3 KiB basesystem noarch 11-21.fc41 fedora 0.0 B binutils x86_64 2.42.90-1.fc41 fedora 27.4 MiB build-reproducibility-srpm-macros noarch 0.3.6-1.fc41 fedora 735.0 B bzip2-libs x86_64 1.0.8-19.fc41 fedora 80.7 KiB ca-certificates noarch 2024.2.68_v8.0.302-3.fc41 fedora 2.3 MiB coreutils-common x86_64 9.5-7.fc41 fedora 11.2 MiB cracklib x86_64 2.9.11-6.fc41 fedora 238.9 KiB crypto-policies noarch 20240807-1.git5795660.fc41 fedora 136.8 KiB curl x86_64 8.9.1-2.fc41 fedora 796.2 KiB cyrus-sasl-lib x86_64 2.1.28-27.fc41 fedora 2.3 MiB debugedit x86_64 5.0-17.fc41 fedora 199.3 KiB dwz x86_64 0.15-7.fc41 fedora 290.9 KiB ed x86_64 1.20.2-2.fc41 fedora 146.9 KiB efi-srpm-macros noarch 5-12.fc41 fedora 40.1 KiB elfutils x86_64 0.191-8.fc41 fedora 2.6 MiB elfutils-debuginfod-client x86_64 0.191-8.fc41 fedora 64.9 KiB elfutils-default-yama-scope noarch 0.191-8.fc41 fedora 1.8 KiB elfutils-libelf x86_64 0.191-8.fc41 fedora 1.2 MiB elfutils-libs x86_64 0.191-8.fc41 fedora 646.2 KiB fedora-gpg-keys noarch 42-0.1 fedora 126.4 KiB fedora-release noarch 42-0.1 fedora 0.0 B fedora-release-identity-basic noarch 42-0.1 fedora 694.0 B fedora-repos noarch 42-0.1 fedora 4.9 KiB fedora-repos-rawhide noarch 42-0.1 fedora 2.2 KiB file x86_64 5.45-7.fc41 fedora 103.5 KiB file-libs x86_64 5.45-7.fc41 fedora 9.9 MiB filesystem x86_64 3.18-23.fc41 fedora 106.0 B fonts-srpm-macros noarch 1:2.0.5-17.fc41 fedora 55.8 KiB forge-srpm-macros noarch 0.3.2-1.fc41 fedora 39.0 KiB fpc-srpm-macros noarch 1.3-13.fc41 fedora 144.0 B gdb-minimal x86_64 15.1-1.fc41 fedora 13.0 MiB gdbm x86_64 1:1.23-7.fc41 fedora 460.9 KiB gdbm-libs x86_64 1:1.23-7.fc41 fedora 121.9 KiB ghc-srpm-macros noarch 1.9.1-2.fc41 fedora 747.0 B glibc x86_64 2.40-3.fc41 fedora 6.7 MiB glibc-common x86_64 2.40-3.fc41 fedora 1.0 MiB glibc-gconv-extra x86_64 2.40-3.fc41 fedora 8.0 MiB gmp x86_64 1:6.3.0-2.fc41 fedora 811.4 KiB gnat-srpm-macros noarch 6-6.fc41 fedora 1.0 KiB go-srpm-macros noarch 3.6.0-3.fc41 fedora 60.8 KiB jansson x86_64 2.13.1-10.fc41 fedora 88.3 KiB kernel-srpm-macros noarch 1.0-24.fc41 fedora 1.9 KiB keyutils-libs x86_64 1.6.3-4.fc41 fedora 54.4 KiB krb5-libs x86_64 1.21.3-2.fc41 fedora 2.3 MiB libacl x86_64 2.3.2-2.fc41 fedora 40.0 KiB libarchive x86_64 3.7.4-3.fc41 fedora 922.6 KiB libattr x86_64 2.5.2-4.fc41 fedora 28.5 KiB libblkid x86_64 2.40.2-4.fc41 fedora 258.5 KiB libbrotli x86_64 1.1.0-5.fc41 fedora 837.6 KiB libcap x86_64 2.70-4.fc41 fedora 220.2 KiB libcap-ng x86_64 0.8.5-3.fc41 fedora 69.2 KiB libcom_err x86_64 1.47.1-3.fc41 fedora 67.2 KiB libcurl x86_64 8.9.1-2.fc41 fedora 818.1 KiB libeconf x86_64 0.6.2-3.fc41 fedora 58.0 KiB libevent x86_64 2.1.12-14.fc41 fedora 895.7 KiB libfdisk x86_64 2.40.2-4.fc41 fedora 362.9 KiB libffi x86_64 3.4.6-2.fc41 fedora 86.4 KiB libgcc x86_64 14.1.1-7.fc41 fedora 274.6 KiB libgomp x86_64 14.1.1-7.fc41 fedora 523.4 KiB libidn2 x86_64 2.3.7-2.fc41 fedora 329.1 KiB libmount x86_64 2.40.2-4.fc41 fedora 351.8 KiB libnghttp2 x86_64 1.62.1-2.fc41 fedora 166.1 KiB libnsl2 x86_64 2.0.1-2.fc41 fedora 57.9 KiB libpkgconf x86_64 2.1.1-2.fc41 fedora 74.2 KiB libpsl x86_64 0.21.5-4.fc41 fedora 80.5 KiB libpwquality x86_64 1.4.5-11.fc41 fedora 417.8 KiB libselinux x86_64 3.7-5.fc41 fedora 181.0 KiB libsemanage x86_64 3.7-2.fc41 fedora 293.5 KiB libsepol x86_64 3.7-2.fc41 fedora 817.8 KiB libsmartcols x86_64 2.40.2-4.fc41 fedora 180.4 KiB libssh x86_64 0.10.6-8.fc41 fedora 513.3 KiB libssh-config noarch 0.10.6-8.fc41 fedora 277.0 B libstdc++ x86_64 14.1.1-7.fc41 fedora 2.8 MiB libtasn1 x86_64 4.19.0-9.fc41 fedora 175.7 KiB libtirpc x86_64 1.3.5-0.fc41 fedora 202.7 KiB libtool-ltdl x86_64 2.4.7-12.fc41 fedora 66.2 KiB libunistring x86_64 1.1-8.fc41 fedora 1.7 MiB libutempter x86_64 1.2.1-15.fc41 fedora 57.7 KiB libuuid x86_64 2.40.2-4.fc41 fedora 37.5 KiB libverto x86_64 0.3.2-9.fc41 fedora 29.5 KiB libxcrypt x86_64 4.4.36-7.fc41 fedora 266.8 KiB libxml2 x86_64 2.12.8-2.fc41 fedora 1.7 MiB libzstd x86_64 1.5.6-2.fc41 fedora 795.9 KiB lua-libs x86_64 5.4.6-6.fc41 fedora 285.0 KiB lua-srpm-macros noarch 1-14.fc41 fedora 1.3 KiB lz4-libs x86_64 1.10.0-1.fc41 fedora 145.5 KiB mpfr x86_64 4.2.1-5.fc41 fedora 832.1 KiB ncurses-base noarch 6.5-2.20240629.fc41 fedora 326.3 KiB ncurses-libs x86_64 6.5-2.20240629.fc41 fedora 975.2 KiB ocaml-srpm-macros noarch 10-3.fc41 fedora 1.9 KiB openblas-srpm-macros noarch 2-18.fc41 fedora 112.0 B openldap x86_64 2.6.8-5.fc41 fedora 644.2 KiB openssl-libs x86_64 1:3.2.2-5.fc41 fedora 7.8 MiB p11-kit x86_64 0.25.5-3.fc41 fedora 2.2 MiB p11-kit-trust x86_64 0.25.5-3.fc41 fedora 391.4 KiB package-notes-srpm-macros noarch 0.5-12.fc41 fedora 1.6 KiB pam x86_64 1.6.1-5.fc41 fedora 1.8 MiB pam-libs x86_64 1.6.1-5.fc41 fedora 139.0 KiB pcre2 x86_64 10.44-1.fc41.1 fedora 653.5 KiB pcre2-syntax noarch 10.44-1.fc41.1 fedora 251.6 KiB perl-srpm-macros noarch 1-56.fc41 fedora 861.0 B pkgconf x86_64 2.1.1-2.fc41 fedora 82.8 KiB pkgconf-m4 noarch 2.1.1-2.fc41 fedora 13.9 KiB pkgconf-pkg-config x86_64 2.1.1-2.fc41 fedora 989.0 B popt x86_64 1.19-7.fc41 fedora 136.9 KiB publicsuffix-list-dafsa noarch 20240107-4.fc41 fedora 67.5 KiB pyproject-srpm-macros noarch 1.14.0-1.fc41 fedora 1.9 KiB python-srpm-macros noarch 3.13-3.fc41 fedora 51.0 KiB qt5-srpm-macros noarch 5.15.14-3.fc41 fedora 500.0 B qt6-srpm-macros noarch 6.7.2-3.fc41 fedora 456.0 B readline x86_64 8.2-10.fc41 fedora 493.2 KiB rpm x86_64 4.19.92-6.fc41 fedora 3.1 MiB rpm-build-libs x86_64 4.19.92-6.fc41 fedora 206.7 KiB rpm-libs x86_64 4.19.92-6.fc41 fedora 721.9 KiB rpm-sequoia x86_64 1.7.0-2.fc41 fedora 2.4 MiB rust-srpm-macros noarch 26.3-1.fc41 fedora 4.8 KiB setup noarch 2.15.0-5.fc41 fedora 720.7 KiB sqlite-libs x86_64 3.46.0-4.fc41 fedora 1.4 MiB systemd-libs x86_64 256.4-1.fc41 fedora 2.0 MiB util-linux-core x86_64 2.40.2-4.fc41 fedora 1.5 MiB xxhash-libs x86_64 0.8.2-3.fc41 fedora 88.5 KiB xz-libs x86_64 1:5.6.2-2.fc41 fedora 214.4 KiB zig-srpm-macros noarch 1-3.fc41 fedora 1.1 KiB zip x86_64 3.0-41.fc41 fedora 703.2 KiB zlib-ng-compat x86_64 2.1.7-2.fc41 fedora 134.0 KiB zstd x86_64 1.5.6-2.fc41 fedora 1.7 MiB Installing groups: Buildsystem building group Transaction Summary: Installing: 154 packages Total size of inbound packages is 53 MiB. Need to download 53 MiB. After this operation 180 MiB will be used (install 180 MiB, remove 0 B). [ 1/154] bzip2-0:1.0.8-19.fc41.x86_64 100% | 3.7 MiB/s | 52.5 KiB | 00m00s [ 2/154] cpio-0:2.15-2.fc41.x86_64 100% | 57.0 MiB/s | 291.8 KiB | 00m00s [ 3/154] bash-0:5.2.32-1.fc41.x86_64 100% | 82.1 MiB/s | 1.8 MiB | 00m00s [ 4/154] diffutils-0:3.10-8.fc41.x86_6 100% | 132.0 MiB/s | 405.4 KiB | 00m00s [ 5/154] fedora-release-common-0:42-0. 100% | 22.4 MiB/s | 22.9 KiB | 00m00s [ 6/154] findutils-1:4.10.0-4.fc41.x86 100% | 267.8 MiB/s | 548.6 KiB | 00m00s [ 7/154] grep-0:3.11-9.fc41.x86_64 100% | 146.4 MiB/s | 299.8 KiB | 00m00s [ 8/154] glibc-minimal-langpack-0:2.40 100% | 24.2 MiB/s | 124.0 KiB | 00m00s [ 9/154] coreutils-0:9.5-7.fc41.x86_64 100% | 36.7 MiB/s | 1.1 MiB | 00m00s [ 10/154] gzip-0:1.13-2.fc41.x86_64 100% | 55.4 MiB/s | 170.2 KiB | 00m00s [ 11/154] redhat-rpm-config-0:293-1.fc4 100% | 26.7 MiB/s | 82.0 KiB | 00m00s [ 12/154] patch-0:2.7.6-25.fc41.x86_64 100% | 21.3 MiB/s | 131.0 KiB | 00m00s [ 13/154] info-0:7.1-3.fc41.x86_64 100% | 22.3 MiB/s | 182.5 KiB | 00m00s [ 14/154] sed-0:4.9-3.fc41.x86_64 100% | 155.1 MiB/s | 317.7 KiB | 00m00s [ 15/154] rpm-build-0:4.19.92-6.fc41.x8 100% | 20.4 MiB/s | 83.4 KiB | 00m00s [ 16/154] shadow-utils-2:4.15.1-9.fc41. 100% | 188.9 MiB/s | 1.3 MiB | 00m00s [ 17/154] tar-2:1.35-4.fc41.x86_64 100% | 105.1 MiB/s | 860.7 KiB | 00m00s [ 18/154] unzip-0:6.0-64.fc41.x86_64 100% | 22.6 MiB/s | 184.9 KiB | 00m00s [ 19/154] which-0:2.21-42.fc41.x86_64 100% | 13.5 MiB/s | 41.6 KiB | 00m00s [ 20/154] xz-1:5.6.2-2.fc41.x86_64 100% | 230.2 MiB/s | 471.5 KiB | 00m00s [ 21/154] gawk-0:5.3.0-4.fc41.x86_64 100% | 119.0 MiB/s | 1.1 MiB | 00m00s [ 22/154] util-linux-0:2.40.2-4.fc41.x8 100% | 120.5 MiB/s | 1.2 MiB | 00m00s [ 23/154] filesystem-0:3.18-23.fc41.x86 100% | 98.8 MiB/s | 1.1 MiB | 00m00s [ 24/154] glibc-0:2.40-3.fc41.x86_64 100% | 203.4 MiB/s | 2.2 MiB | 00m00s [ 25/154] ncurses-libs-0:6.5-2.20240629 100% | 32.6 MiB/s | 334.0 KiB | 00m00s [ 26/154] bzip2-libs-0:1.0.8-19.fc41.x8 100% | 5.0 MiB/s | 41.1 KiB | 00m00s [ 27/154] libacl-0:2.3.2-2.fc41.x86_64 100% | 3.4 MiB/s | 24.5 KiB | 00m00s [ 28/154] libattr-0:2.5.2-4.fc41.x86_64 100% | 2.5 MiB/s | 18.2 KiB | 00m00s [ 29/154] coreutils-common-0:9.5-7.fc41 100% | 176.7 MiB/s | 2.1 MiB | 00m00s [ 30/154] libcap-0:2.70-4.fc41.x86_64 100% | 21.2 MiB/s | 86.7 KiB | 00m00s [ 31/154] fedora-repos-0:42-0.1.noarch 100% | 1.5 MiB/s | 9.2 KiB | 00m00s [ 32/154] openssl-libs-1:3.2.2-5.fc41.x 100% | 231.0 MiB/s | 2.3 MiB | 00m00s [ 33/154] libselinux-0:3.7-5.fc41.x86_6 100% | 6.1 MiB/s | 87.8 KiB | 00m00s [ 34/154] glibc-common-0:2.40-3.fc41.x8 100% | 100.7 MiB/s | 412.4 KiB | 00m00s [ 35/154] pcre2-0:10.44-1.fc41.1.x86_64 100% | 118.7 MiB/s | 243.1 KiB | 00m00s [ 36/154] ansible-srpm-macros-0:1-16.fc 100% | 6.8 MiB/s | 20.8 KiB | 00m00s [ 37/154] ed-0:1.20.2-2.fc41.x86_64 100% | 16.0 MiB/s | 81.8 KiB | 00m00s [ 38/154] build-reproducibility-srpm-ma 100% | 757.4 KiB/s | 10.6 KiB | 00m00s [ 39/154] efi-srpm-macros-0:5-12.fc41.n 100% | 2.0 MiB/s | 22.4 KiB | 00m00s [ 40/154] dwz-0:0.15-7.fc41.x86_64 100% | 7.9 MiB/s | 138.0 KiB | 00m00s [ 41/154] file-0:5.45-7.fc41.x86_64 100% | 6.9 MiB/s | 49.1 KiB | 00m00s [ 42/154] forge-srpm-macros-0:0.3.2-1.f 100% | 9.6 MiB/s | 19.7 KiB | 00m00s [ 43/154] fonts-srpm-macros-1:2.0.5-17. 100% | 3.8 MiB/s | 27.0 KiB | 00m00s [ 44/154] fpc-srpm-macros-0:1.3-13.fc41 100% | 3.9 MiB/s | 8.0 KiB | 00m00s [ 45/154] gnat-srpm-macros-0:6-6.fc41.n 100% | 4.4 MiB/s | 9.0 KiB | 00m00s [ 46/154] ghc-srpm-macros-0:1.9.1-2.fc4 100% | 4.4 MiB/s | 9.1 KiB | 00m00s [ 47/154] kernel-srpm-macros-0:1.0-24.f 100% | 9.6 MiB/s | 9.9 KiB | 00m00s [ 48/154] lua-srpm-macros-0:1-14.fc41.n 100% | 8.7 MiB/s | 8.9 KiB | 00m00s [ 49/154] ocaml-srpm-macros-0:10-3.fc41 100% | 9.0 MiB/s | 9.2 KiB | 00m00s [ 50/154] openblas-srpm-macros-0:2-18.f 100% | 7.5 MiB/s | 7.7 KiB | 00m00s [ 51/154] go-srpm-macros-0:3.6.0-3.fc41 100% | 4.5 MiB/s | 28.0 KiB | 00m00s [ 52/154] package-notes-srpm-macros-0:0 100% | 4.8 MiB/s | 9.8 KiB | 00m00s [ 53/154] perl-srpm-macros-0:1-56.fc41. 100% | 4.2 MiB/s | 8.5 KiB | 00m00s [ 54/154] qt5-srpm-macros-0:5.15.14-3.f 100% | 8.8 MiB/s | 9.0 KiB | 00m00s [ 55/154] pyproject-srpm-macros-0:1.14. 100% | 6.6 MiB/s | 13.4 KiB | 00m00s [ 56/154] python-srpm-macros-0:3.13-3.f 100% | 5.8 MiB/s | 23.7 KiB | 00m00s [ 57/154] qt6-srpm-macros-0:6.7.2-3.fc4 100% | 4.5 MiB/s | 9.1 KiB | 00m00s [ 58/154] zig-srpm-macros-0:1-3.fc41.no 100% | 4.0 MiB/s | 8.1 KiB | 00m00s [ 59/154] rust-srpm-macros-0:26.3-1.fc4 100% | 6.1 MiB/s | 12.5 KiB | 00m00s [ 60/154] zip-0:3.0-41.fc41.x86_64 100% | 129.3 MiB/s | 264.8 KiB | 00m00s [ 61/154] elfutils-0:0.191-8.fc41.x86_6 100% | 172.3 MiB/s | 529.4 KiB | 00m00s [ 62/154] debugedit-0:5.0-17.fc41.x86_6 100% | 13.0 MiB/s | 80.1 KiB | 00m00s [ 63/154] rpm-0:4.19.92-6.fc41.x86_64 100% | 45.0 MiB/s | 552.4 KiB | 00m00s [ 64/154] elfutils-libelf-0:0.191-8.fc4 100% | 101.4 MiB/s | 207.6 KiB | 00m00s [ 65/154] libarchive-0:3.7.4-3.fc41.x86 100% | 133.2 MiB/s | 409.2 KiB | 00m00s [ 66/154] popt-0:1.19-7.fc41.x86_64 100% | 32.2 MiB/s | 65.9 KiB | 00m00s [ 67/154] readline-0:8.2-10.fc41.x86_64 100% | 104.1 MiB/s | 213.2 KiB | 00m00s [ 68/154] zstd-0:1.5.6-2.fc41.x86_64 100% | 235.1 MiB/s | 481.5 KiB | 00m00s [ 69/154] rpm-build-libs-0:4.19.92-6.fc 100% | 24.3 MiB/s | 99.4 KiB | 00m00s [ 70/154] libeconf-0:0.6.2-3.fc41.x86_6 100% | 31.4 MiB/s | 32.2 KiB | 00m00s [ 71/154] rpm-libs-0:4.19.92-6.fc41.x86 100% | 50.4 MiB/s | 309.5 KiB | 00m00s [ 72/154] libxcrypt-0:4.4.36-7.fc41.x86 100% | 57.9 MiB/s | 118.5 KiB | 00m00s [ 73/154] libsemanage-0:3.7-2.fc41.x86_ 100% | 28.4 MiB/s | 116.3 KiB | 00m00s [ 74/154] pam-libs-0:1.6.1-5.fc41.x86_6 100% | 56.0 MiB/s | 57.3 KiB | 00m00s [ 75/154] setup-0:2.15.0-5.fc41.noarch 100% | 150.8 MiB/s | 154.4 KiB | 00m00s [ 76/154] xz-libs-1:5.6.2-2.fc41.x86_64 100% | 109.1 MiB/s | 111.8 KiB | 00m00s [ 77/154] audit-libs-0:4.0.2-1.fc41.x86 100% | 13.7 MiB/s | 126.2 KiB | 00m00s [ 78/154] mpfr-0:4.2.1-5.fc41.x86_64 100% | 169.1 MiB/s | 346.3 KiB | 00m00s [ 79/154] libblkid-0:2.40.2-4.fc41.x86_ 100% | 60.9 MiB/s | 124.7 KiB | 00m00s [ 80/154] libfdisk-0:2.40.2-4.fc41.x86_ 100% | 78.0 MiB/s | 159.8 KiB | 00m00s [ 81/154] libcap-ng-0:0.8.5-3.fc41.x86_ 100% | 7.9 MiB/s | 32.6 KiB | 00m00s [ 82/154] libmount-0:2.40.2-4.fc41.x86_ 100% | 50.6 MiB/s | 155.5 KiB | 00m00s [ 83/154] libsmartcols-0:2.40.2-4.fc41. 100% | 40.9 MiB/s | 83.7 KiB | 00m00s [ 84/154] libutempter-0:1.2.1-15.fc41.x 100% | 26.0 MiB/s | 26.6 KiB | 00m00s [ 85/154] libuuid-0:2.40.2-4.fc41.x86_6 100% | 28.4 MiB/s | 29.1 KiB | 00m00s [ 86/154] systemd-libs-0:256.4-1.fc41.x 100% | 238.0 MiB/s | 731.0 KiB | 00m00s [ 87/154] util-linux-core-0:2.40.2-4.fc 100% | 131.2 MiB/s | 537.2 KiB | 00m00s [ 88/154] zlib-ng-compat-0:2.1.7-2.fc41 100% | 19.0 MiB/s | 77.8 KiB | 00m00s [ 89/154] glibc-gconv-extra-0:2.40-3.fc 100% | 282.1 MiB/s | 1.7 MiB | 00m00s [ 90/154] basesystem-0:11-21.fc41.noarc 100% | 1.2 MiB/s | 7.4 KiB | 00m00s [ 91/154] libgcc-0:14.1.1-7.fc41.x86_64 100% | 25.5 MiB/s | 130.4 KiB | 00m00s [ 92/154] ncurses-base-0:6.5-2.20240629 100% | 43.1 MiB/s | 88.4 KiB | 00m00s [ 93/154] ca-certificates-0:2024.2.68_v 100% | 212.0 MiB/s | 868.2 KiB | 00m00s [ 94/154] crypto-policies-0:20240807-1. 100% | 31.0 MiB/s | 95.2 KiB | 00m00s [ 95/154] fedora-gpg-keys-0:42-0.1.noar 100% | 130.5 MiB/s | 133.6 KiB | 00m00s [ 96/154] fedora-repos-rawhide-0:42-0.1 100% | 8.5 MiB/s | 8.7 KiB | 00m00s [ 97/154] pcre2-syntax-0:10.44-1.fc41.1 100% | 146.4 MiB/s | 149.9 KiB | 00m00s [ 98/154] libsepol-0:3.7-2.fc41.x86_64 100% | 30.4 MiB/s | 342.2 KiB | 00m00s [ 99/154] curl-0:8.9.1-2.fc41.x86_64 100% | 102.6 MiB/s | 315.1 KiB | 00m00s [100/154] elfutils-libs-0:0.191-8.fc41. 100% | 125.7 MiB/s | 257.5 KiB | 00m00s [101/154] elfutils-debuginfod-client-0: 100% | 36.4 MiB/s | 37.2 KiB | 00m00s [102/154] libstdc++-0:14.1.1-7.fc41.x86 100% | 216.1 MiB/s | 885.1 KiB | 00m00s [103/154] libzstd-0:1.5.6-2.fc41.x86_64 100% | 151.5 MiB/s | 310.3 KiB | 00m00s [104/154] libxml2-0:2.12.8-2.fc41.x86_6 100% | 167.8 MiB/s | 687.3 KiB | 00m00s [105/154] add-determinism-0:0.3.6-1.fc4 100% | 33.3 MiB/s | 851.8 KiB | 00m00s [106/154] lz4-libs-0:1.10.0-1.fc41.x86_ 100% | 69.0 MiB/s | 70.7 KiB | 00m00s [107/154] libgomp-0:14.1.1-7.fc41.x86_6 100% | 171.4 MiB/s | 351.1 KiB | 00m00s [108/154] lua-libs-0:5.4.6-6.fc41.x86_6 100% | 64.5 MiB/s | 132.0 KiB | 00m00s [109/154] rpm-sequoia-0:1.7.0-2.fc41.x8 100% | 217.9 MiB/s | 892.5 KiB | 00m00s [110/154] sqlite-libs-0:3.46.0-4.fc41.x 100% | 139.1 MiB/s | 712.4 KiB | 00m00s [111/154] elfutils-default-yama-scope-0 100% | 6.0 MiB/s | 12.3 KiB | 00m00s [112/154] authselect-libs-0:1.5.0-7.fc4 100% | 106.5 MiB/s | 218.1 KiB | 00m00s [113/154] pam-0:1.6.1-5.fc41.x86_64 100% | 135.2 MiB/s | 554.0 KiB | 00m00s [114/154] file-libs-0:5.45-7.fc41.x86_6 100% | 19.6 MiB/s | 762.0 KiB | 00m00s [115/154] authselect-0:1.5.0-7.fc41.x86 100% | 47.4 MiB/s | 145.7 KiB | 00m00s [116/154] gdbm-libs-1:1.23-7.fc41.x86_6 100% | 55.0 MiB/s | 56.3 KiB | 00m00s [117/154] libnsl2-0:2.0.1-2.fc41.x86_64 100% | 28.9 MiB/s | 29.6 KiB | 00m00s [118/154] libpwquality-0:1.4.5-11.fc41. 100% | 58.1 MiB/s | 119.1 KiB | 00m00s [119/154] libtirpc-0:1.3.5-0.fc41.x86_6 100% | 46.0 MiB/s | 94.2 KiB | 00m00s [120/154] cracklib-0:2.9.11-6.fc41.x86_ 100% | 44.9 MiB/s | 92.0 KiB | 00m00s [121/154] krb5-libs-0:1.21.3-2.fc41.x86 100% | 184.9 MiB/s | 757.4 KiB | 00m00s [122/154] libcom_err-0:1.47.1-3.fc41.x8 100% | 8.6 MiB/s | 26.4 KiB | 00m00s [123/154] keyutils-libs-0:1.6.3-4.fc41. 100% | 7.7 MiB/s | 31.6 KiB | 00m00s [124/154] libverto-0:0.3.2-9.fc41.x86_6 100% | 6.7 MiB/s | 20.7 KiB | 00m00s [125/154] alternatives-0:1.30-1.fc41.x8 100% | 5.2 MiB/s | 42.5 KiB | 00m00s [126/154] jansson-0:2.13.1-10.fc41.x86_ 100% | 2.2 MiB/s | 44.4 KiB | 00m00s [127/154] pkgconf-pkg-config-0:2.1.1-2. 100% | 625.8 KiB/s | 10.0 KiB | 00m00s [128/154] binutils-0:2.42.90-1.fc41.x86 100% | 189.1 MiB/s | 6.4 MiB | 00m00s [129/154] pkgconf-0:2.1.1-2.fc41.x86_64 100% | 3.9 MiB/s | 43.9 KiB | 00m00s [130/154] pkgconf-m4-0:2.1.1-2.fc41.noa 100% | 1.7 MiB/s | 14.2 KiB | 00m00s [131/154] libpkgconf-0:2.1.1-2.fc41.x86 100% | 18.6 MiB/s | 38.2 KiB | 00m00s [132/154] gdbm-1:1.23-7.fc41.x86_64 100% | 74.1 MiB/s | 151.8 KiB | 00m00s [133/154] p11-kit-0:0.25.5-3.fc41.x86_6 100% | 119.9 MiB/s | 490.9 KiB | 00m00s [134/154] libffi-0:3.4.6-2.fc41.x86_64 100% | 9.8 MiB/s | 40.2 KiB | 00m00s [135/154] libtasn1-0:4.19.0-9.fc41.x86_ 100% | 18.1 MiB/s | 74.2 KiB | 00m00s [136/154] p11-kit-trust-0:0.25.5-3.fc41 100% | 25.8 MiB/s | 132.1 KiB | 00m00s [137/154] fedora-release-0:42-0.1.noarc 100% | 1.7 MiB/s | 12.1 KiB | 00m00s [138/154] xxhash-libs-0:0.8.2-3.fc41.x8 100% | 9.1 MiB/s | 37.1 KiB | 00m00s [139/154] gmp-1:6.3.0-2.fc41.x86_64 100% | 103.5 MiB/s | 318.0 KiB | 00m00s [140/154] fedora-release-identity-basic 100% | 3.2 MiB/s | 12.9 KiB | 00m00s [141/154] libbrotli-0:1.1.0-5.fc41.x86_ 100% | 166.2 MiB/s | 340.5 KiB | 00m00s [142/154] libidn2-0:2.3.7-2.fc41.x86_64 100% | 57.8 MiB/s | 118.4 KiB | 00m00s [143/154] libcurl-0:8.9.1-2.fc41.x86_64 100% | 50.5 MiB/s | 361.9 KiB | 00m00s [144/154] libnghttp2-0:1.62.1-2.fc41.x8 100% | 74.9 MiB/s | 76.6 KiB | 00m00s [145/154] libpsl-0:0.21.5-4.fc41.x86_64 100% | 31.3 MiB/s | 64.1 KiB | 00m00s [146/154] libssh-0:0.10.6-8.fc41.x86_64 100% | 103.4 MiB/s | 211.8 KiB | 00m00s [147/154] openldap-0:2.6.8-5.fc41.x86_6 100% | 124.8 MiB/s | 255.6 KiB | 00m00s [148/154] libunistring-0:1.1-8.fc41.x86 100% | 177.4 MiB/s | 544.8 KiB | 00m00s [149/154] publicsuffix-list-dafsa-0:202 100% | 19.0 MiB/s | 58.3 KiB | 00m00s [150/154] libssh-config-0:0.10.6-8.fc41 100% | 4.5 MiB/s | 9.2 KiB | 00m00s [151/154] cyrus-sasl-lib-0:2.1.28-27.fc 100% | 258.7 MiB/s | 794.9 KiB | 00m00s [152/154] libevent-0:2.1.12-14.fc41.x86 100% | 50.3 MiB/s | 257.5 KiB | 00m00s [153/154] libtool-ltdl-0:2.4.7-12.fc41. 100% | 7.0 MiB/s | 35.6 KiB | 00m00s [154/154] gdb-minimal-0:15.1-1.fc41.x86 100% | 52.6 MiB/s | 4.3 MiB | 00m00s -------------------------------------------------------------------------------- [154/154] Total 100% | 124.3 MiB/s | 53.2 MiB | 00m00s Running transaction Importing PGP key 0x105EF944: Userid : "Fedora (42) " Fingerprint: B0F4950458F69E1150C6C5EDC8AC4916105EF944 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-42-primary The key was successfully imported. Importing PGP key 0x105EF944: Userid : "Fedora (42) " Fingerprint: B0F4950458F69E1150C6C5EDC8AC4916105EF944 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-42-primary The key was successfully imported. Importing PGP key 0xE99D6AD1: Userid : "Fedora (41) " Fingerprint: 466CF2D8B60BC3057AA9453ED0622462E99D6AD1 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-41-primary The key was successfully imported. Importing PGP key 0x31645531: Userid : "Fedora (43) " Fingerprint: C6E7F081CF80E13146676E88829B606631645531 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-43-primary The key was successfully imported. [ 1/156] Verify package files 100% | 880.0 B/s | 154.0 B | 00m00s >>> Running pre-transaction scriptlet: filesystem-0:3.18-23.fc41.x86_64 >>> Stop pre-transaction scriptlet: filesystem-0:3.18-23.fc41.x86_64 [ 2/156] Prepare transaction 100% | 4.3 KiB/s | 154.0 B | 00m00s [ 3/156] Installing libgcc-0:14.1.1-7. 100% | 269.8 MiB/s | 276.3 KiB | 00m00s >>> Running post-install scriptlet: libgcc-0:14.1.1-7.fc41.x86_64 >>> Stop post-install scriptlet: libgcc-0:14.1.1-7.fc41.x86_64 [ 4/156] Installing libssh-config-0:0. 100% | 0.0 B/s | 816.0 B | 00m00s [ 5/156] Installing publicsuffix-list- 100% | 0.0 B/s | 68.3 KiB | 00m00s [ 6/156] Installing fedora-release-ide 100% | 0.0 B/s | 952.0 B | 00m00s [ 7/156] Installing fedora-repos-rawhi 100% | 0.0 B/s | 2.4 KiB | 00m00s [ 8/156] Installing fedora-gpg-keys-0: 100% | 56.1 MiB/s | 172.2 KiB | 00m00s [ 9/156] Installing fedora-repos-0:42- 100% | 0.0 B/s | 5.7 KiB | 00m00s [ 10/156] Installing fedora-release-com 100% | 23.1 MiB/s | 23.7 KiB | 00m00s [ 11/156] Installing fedora-release-0:4 100% | 0.0 B/s | 124.0 B | 00m00s [ 12/156] Installing setup-0:2.15.0-5.f 100% | 59.1 MiB/s | 726.1 KiB | 00m00s >>> Running post-install scriptlet: setup-0:2.15.0-5.fc41.noarch >>> Stop post-install scriptlet: setup-0:2.15.0-5.fc41.noarch [ 13/156] Installing filesystem-0:3.18- 100% | 3.5 MiB/s | 212.5 KiB | 00m00s [ 14/156] Installing basesystem-0:11-21 100% | 0.0 B/s | 124.0 B | 00m00s [ 15/156] Installing pkgconf-m4-0:2.1.1 100% | 0.0 B/s | 14.3 KiB | 00m00s [ 16/156] Installing pcre2-syntax-0:10. 100% | 248.1 MiB/s | 254.1 KiB | 00m00s [ 17/156] Installing ncurses-base-0:6.5 100% | 85.9 MiB/s | 351.7 KiB | 00m00s [ 18/156] Installing glibc-minimal-lang 100% | 0.0 B/s | 124.0 B | 00m00s [ 19/156] Installing ncurses-libs-0:6.5 100% | 239.7 MiB/s | 981.8 KiB | 00m00s >>> Running pre-install scriptlet: glibc-0:2.40-3.fc41.x86_64 >>> Stop pre-install scriptlet: glibc-0:2.40-3.fc41.x86_64 warning: posix.fork(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.wait(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.exec(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.fork(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.wait(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.exec(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead [ 20/156] Installing glibc-0:2.40-3.fc4 100% | 230.2 MiB/s | 6.7 MiB | 00m00s >>> Running post-install scriptlet: glibc-0:2.40-3.fc41.x86_64 >>> Stop post-install scriptlet: glibc-0:2.40-3.fc41.x86_64 [ 21/156] Installing bash-0:5.2.32-1.fc 100% | 389.0 MiB/s | 8.2 MiB | 00m00s >>> Running post-install scriptlet: bash-0:5.2.32-1.fc41.x86_64 >>> Stop post-install scriptlet: bash-0:5.2.32-1.fc41.x86_64 [ 22/156] Installing glibc-common-0:2.4 100% | 174.5 MiB/s | 1.0 MiB | 00m00s warning: posix.fork(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.wait(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.exec(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead [ 23/156] Installing glibc-gconv-extra- 100% | 253.1 MiB/s | 8.1 MiB | 00m00s >>> Running post-install scriptlet: glibc-gconv-extra-0:2.40-3.fc41.x86_64 >>> Stop post-install scriptlet: glibc-gconv-extra-0:2.40-3.fc41.x86_64 [ 24/156] Installing zlib-ng-compat-0:2 100% | 131.7 MiB/s | 134.8 KiB | 00m00s [ 25/156] Installing bzip2-libs-0:1.0.8 100% | 0.0 B/s | 81.8 KiB | 00m00s [ 26/156] Installing xz-libs-1:5.6.2-2. 100% | 105.2 MiB/s | 215.5 KiB | 00m00s [ 27/156] Installing popt-0:1.19-7.fc41 100% | 70.1 MiB/s | 143.5 KiB | 00m00s [ 28/156] Installing readline-0:8.2-10. 100% | 241.8 MiB/s | 495.3 KiB | 00m00s [ 29/156] Installing libuuid-0:2.40.2-4 100% | 0.0 B/s | 38.6 KiB | 00m00s [ 30/156] Installing libblkid-0:2.40.2- 100% | 253.4 MiB/s | 259.5 KiB | 00m00s [ 31/156] Installing libattr-0:2.5.2-4. 100% | 0.0 B/s | 29.5 KiB | 00m00s [ 32/156] Installing libacl-0:2.3.2-2.f 100% | 0.0 B/s | 40.7 KiB | 00m00s [ 33/156] Installing libxcrypt-0:4.4.36 100% | 263.2 MiB/s | 269.5 KiB | 00m00s [ 34/156] Installing libstdc++-0:14.1.1 100% | 346.3 MiB/s | 2.8 MiB | 00m00s [ 35/156] Installing libzstd-0:1.5.6-2. 100% | 389.3 MiB/s | 797.2 KiB | 00m00s [ 36/156] Installing elfutils-libelf-0: 100% | 389.7 MiB/s | 1.2 MiB | 00m00s [ 37/156] Installing gmp-1:6.3.0-2.fc41 100% | 397.3 MiB/s | 813.7 KiB | 00m00s [ 38/156] Installing libeconf-0:0.6.2-3 100% | 58.3 MiB/s | 59.7 KiB | 00m00s [ 39/156] Installing gdbm-libs-1:1.23-7 100% | 120.7 MiB/s | 123.6 KiB | 00m00s [ 40/156] Installing mpfr-0:4.2.1-5.fc4 100% | 271.4 MiB/s | 833.7 KiB | 00m00s [ 41/156] Installing gawk-0:5.3.0-4.fc4 100% | 288.7 MiB/s | 1.7 MiB | 00m00s [ 42/156] Installing dwz-0:0.15-7.fc41. 100% | 285.5 MiB/s | 292.3 KiB | 00m00s [ 43/156] Installing unzip-0:6.0-64.fc4 100% | 190.6 MiB/s | 390.3 KiB | 00m00s [ 44/156] Installing file-libs-0:5.45-7 100% | 620.9 MiB/s | 9.9 MiB | 00m00s [ 45/156] Installing file-0:5.45-7.fc41 100% | 17.1 MiB/s | 105.0 KiB | 00m00s >>> Running pre-install scriptlet: crypto-policies-0:20240807-1.git5795660.fc41. >>> Stop pre-install scriptlet: crypto-policies-0:20240807-1.git5795660.fc41.noa [ 46/156] Installing crypto-policies-0: 100% | 31.9 MiB/s | 163.1 KiB | 00m00s >>> Running post-install scriptlet: crypto-policies-0:20240807-1.git5795660.fc41 >>> Stop post-install scriptlet: crypto-policies-0:20240807-1.git5795660.fc41.no [ 47/156] Installing pcre2-0:10.44-1.fc 100% | 319.8 MiB/s | 654.9 KiB | 00m00s [ 48/156] Installing grep-0:3.11-9.fc41 100% | 200.7 MiB/s | 1.0 MiB | 00m00s [ 49/156] Installing xz-1:5.6.2-2.fc41. 100% | 241.0 MiB/s | 1.2 MiB | 00m00s [ 50/156] Installing libcap-ng-0:0.8.5- 100% | 69.4 MiB/s | 71.0 KiB | 00m00s [ 51/156] Installing audit-libs-0:4.0.2 100% | 325.6 MiB/s | 333.4 KiB | 00m00s [ 52/156] Installing pam-libs-0:1.6.1-5 100% | 138.1 MiB/s | 141.4 KiB | 00m00s [ 53/156] Installing libcap-0:2.70-4.fc 100% | 110.0 MiB/s | 225.2 KiB | 00m00s [ 54/156] Installing systemd-libs-0:256 100% | 290.6 MiB/s | 2.0 MiB | 00m00s [ 55/156] Installing libsmartcols-0:2.4 100% | 177.1 MiB/s | 181.4 KiB | 00m00s [ 56/156] Installing libsepol-0:3.7-2.f 100% | 399.8 MiB/s | 818.8 KiB | 00m00s [ 57/156] Installing libselinux-0:3.7-5 100% | 178.0 MiB/s | 182.3 KiB | 00m00s [ 58/156] Installing sed-0:4.9-3.fc41.x 100% | 212.3 MiB/s | 869.7 KiB | 00m00s [ 59/156] Installing findutils-1:4.10.0 100% | 309.7 MiB/s | 1.9 MiB | 00m00s [ 60/156] Installing libmount-0:2.40.2- 100% | 344.7 MiB/s | 352.9 KiB | 00m00s [ 61/156] Installing lz4-libs-0:1.10.0- 100% | 143.1 MiB/s | 146.6 KiB | 00m00s [ 62/156] Installing lua-libs-0:5.4.6-6 100% | 279.5 MiB/s | 286.2 KiB | 00m00s [ 63/156] Installing libcom_err-0:1.47. 100% | 0.0 B/s | 68.3 KiB | 00m00s [ 64/156] Installing alternatives-0:1.3 100% | 66.3 MiB/s | 67.9 KiB | 00m00s [ 65/156] Installing libtasn1-0:4.19.0- 100% | 173.3 MiB/s | 177.5 KiB | 00m00s [ 66/156] Installing libunistring-0:1.1 100% | 346.1 MiB/s | 1.7 MiB | 00m00s [ 67/156] Installing libidn2-0:2.3.7-2. 100% | 163.6 MiB/s | 335.1 KiB | 00m00s [ 68/156] Installing libpsl-0:0.21.5-4. 100% | 79.7 MiB/s | 81.7 KiB | 00m00s [ 69/156] Installing zstd-0:1.5.6-2.fc4 100% | 422.9 MiB/s | 1.7 MiB | 00m00s [ 70/156] Installing util-linux-core-0: 100% | 247.6 MiB/s | 1.5 MiB | 00m00s [ 71/156] Installing tar-2:1.35-4.fc41. 100% | 369.8 MiB/s | 3.0 MiB | 00m00s [ 72/156] Installing libsemanage-0:3.7- 100% | 144.2 MiB/s | 295.2 KiB | 00m00s [ 73/156] Installing shadow-utils-2:4.1 100% | 167.8 MiB/s | 4.2 MiB | 00m00s >>> Running pre-install scriptlet: libutempter-0:1.2.1-15.fc41.x86_64 >>> Stop pre-install scriptlet: libutempter-0:1.2.1-15.fc41.x86_64 [ 74/156] Installing libutempter-0:1.2. 100% | 58.3 MiB/s | 59.7 KiB | 00m00s [ 75/156] Installing zip-0:3.0-41.fc41. 100% | 345.2 MiB/s | 707.1 KiB | 00m00s [ 76/156] Installing gdbm-1:1.23-7.fc41 100% | 227.4 MiB/s | 465.8 KiB | 00m00s [ 77/156] Installing cyrus-sasl-lib-0:2 100% | 384.3 MiB/s | 2.3 MiB | 00m00s [ 78/156] Installing libfdisk-0:2.40.2- 100% | 355.5 MiB/s | 364.1 KiB | 00m00s [ 79/156] Installing libxml2-0:2.12.8-2 100% | 342.4 MiB/s | 1.7 MiB | 00m00s [ 80/156] Installing bzip2-0:1.0.8-19.f 100% | 97.8 MiB/s | 100.2 KiB | 00m00s [ 81/156] Installing add-determinism-0: 100% | 374.3 MiB/s | 2.2 MiB | 00m00s [ 82/156] Installing build-reproducibil 100% | 0.0 B/s | 1.0 KiB | 00m00s [ 83/156] Installing sqlite-libs-0:3.46 100% | 357.3 MiB/s | 1.4 MiB | 00m00s [ 84/156] Installing ed-0:1.20.2-2.fc41 100% | 145.7 MiB/s | 149.2 KiB | 00m00s [ 85/156] Installing patch-0:2.7.6-25.f 100% | 261.9 MiB/s | 268.2 KiB | 00m00s [ 86/156] Installing elfutils-default-y 100% | 408.6 KiB/s | 2.0 KiB | 00m00s >>> Running post-install scriptlet: elfutils-default-yama-scope-0:0.191-8.fc41.n >>> Stop post-install scriptlet: elfutils-default-yama-scope-0:0.191-8.fc41.noar [ 87/156] Installing elfutils-libs-0:0. 100% | 210.9 MiB/s | 648.0 KiB | 00m00s [ 88/156] Installing cpio-0:2.15-2.fc41 100% | 274.9 MiB/s | 1.1 MiB | 00m00s [ 89/156] Installing diffutils-0:3.10-8 100% | 318.1 MiB/s | 1.6 MiB | 00m00s [ 90/156] Installing libgomp-0:14.1.1-7 100% | 512.6 MiB/s | 524.9 KiB | 00m00s [ 91/156] Installing keyutils-libs-0:1. 100% | 0.0 B/s | 55.8 KiB | 00m00s [ 92/156] Installing libverto-0:0.3.2-9 100% | 0.0 B/s | 31.3 KiB | 00m00s [ 93/156] Installing jansson-0:2.13.1-1 100% | 87.6 MiB/s | 89.7 KiB | 00m00s [ 94/156] Installing libpkgconf-0:2.1.1 100% | 0.0 B/s | 75.3 KiB | 00m00s [ 95/156] Installing pkgconf-0:2.1.1-2. 100% | 83.4 MiB/s | 85.4 KiB | 00m00s [ 96/156] Installing pkgconf-pkg-config 100% | 0.0 B/s | 1.8 KiB | 00m00s [ 97/156] Installing libffi-0:3.4.6-2.f 100% | 85.7 MiB/s | 87.8 KiB | 00m00s [ 98/156] Installing p11-kit-0:0.25.5-3 100% | 275.9 MiB/s | 2.2 MiB | 00m00s [ 99/156] Installing p11-kit-trust-0:0. 100% | 64.0 MiB/s | 393.1 KiB | 00m00s >>> Running post-install scriptlet: p11-kit-trust-0:0.25.5-3.fc41.x86_64 >>> Stop post-install scriptlet: p11-kit-trust-0:0.25.5-3.fc41.x86_64 [100/156] Installing xxhash-libs-0:0.8. 100% | 87.8 MiB/s | 89.9 KiB | 00m00s [101/156] Installing libbrotli-0:1.1.0- 100% | 273.4 MiB/s | 839.9 KiB | 00m00s [102/156] Installing libnghttp2-0:1.62. 100% | 163.2 MiB/s | 167.1 KiB | 00m00s [103/156] Installing libtool-ltdl-0:2.4 100% | 0.0 B/s | 67.3 KiB | 00m00s [104/156] Installing rust-srpm-macros-0 100% | 0.0 B/s | 5.6 KiB | 00m00s [105/156] Installing qt6-srpm-macros-0: 100% | 0.0 B/s | 732.0 B | 00m00s [106/156] Installing qt5-srpm-macros-0: 100% | 0.0 B/s | 776.0 B | 00m00s [107/156] Installing perl-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [108/156] Installing package-notes-srpm 100% | 0.0 B/s | 2.0 KiB | 00m00s [109/156] Installing openblas-srpm-macr 100% | 0.0 B/s | 392.0 B | 00m00s [110/156] Installing ocaml-srpm-macros- 100% | 0.0 B/s | 2.2 KiB | 00m00s [111/156] Installing kernel-srpm-macros 100% | 0.0 B/s | 2.3 KiB | 00m00s [112/156] Installing gnat-srpm-macros-0 100% | 0.0 B/s | 1.3 KiB | 00m00s [113/156] Installing ghc-srpm-macros-0: 100% | 0.0 B/s | 1.0 KiB | 00m00s [114/156] Installing fpc-srpm-macros-0: 100% | 0.0 B/s | 420.0 B | 00m00s [115/156] Installing ansible-srpm-macro 100% | 35.4 MiB/s | 36.2 KiB | 00m00s [116/156] Installing coreutils-common-0 100% | 399.6 MiB/s | 11.2 MiB | 00m00s [117/156] Installing openssl-libs-1:3.2 100% | 411.9 MiB/s | 7.8 MiB | 00m00s [118/156] Installing coreutils-0:9.5-7. 100% | 295.1 MiB/s | 5.6 MiB | 00m00s >>> Running pre-install scriptlet: ca-certificates-0:2024.2.68_v8.0.302-3.fc41.n >>> Stop pre-install scriptlet: ca-certificates-0:2024.2.68_v8.0.302-3.fc41.noar [119/156] Installing ca-certificates-0: 100% | 4.0 MiB/s | 2.4 MiB | 00m01s >>> Running post-install scriptlet: ca-certificates-0:2024.2.68_v8.0.302-3.fc41. >>> Stop post-install scriptlet: ca-certificates-0:2024.2.68_v8.0.302-3.fc41.noa [120/156] Installing krb5-libs-0:1.21.3 100% | 287.4 MiB/s | 2.3 MiB | 00m00s [121/156] Installing libarchive-0:3.7.4 100% | 301.0 MiB/s | 924.6 KiB | 00m00s [122/156] Installing libtirpc-0:1.3.5-0 100% | 199.7 MiB/s | 204.5 KiB | 00m00s [123/156] Installing gzip-0:1.13-2.fc41 100% | 192.7 MiB/s | 394.6 KiB | 00m00s [124/156] Installing authselect-libs-0: 100% | 162.7 MiB/s | 833.2 KiB | 00m00s [125/156] Installing cracklib-0:2.9.11- 100% | 81.5 MiB/s | 250.3 KiB | 00m00s [126/156] Installing libpwquality-0:1.4 100% | 105.0 MiB/s | 430.1 KiB | 00m00s [127/156] Installing libnsl2-0:2.0.1-2. 100% | 57.7 MiB/s | 59.1 KiB | 00m00s [128/156] Installing pam-0:1.6.1-5.fc41 100% | 170.6 MiB/s | 1.9 MiB | 00m00s [129/156] Installing libssh-0:0.10.6-8. 100% | 251.7 MiB/s | 515.4 KiB | 00m00s [130/156] Installing rpm-sequoia-0:1.7. 100% | 394.5 MiB/s | 2.4 MiB | 00m00s [131/156] Installing rpm-libs-0:4.19.92 100% | 353.2 MiB/s | 723.4 KiB | 00m00s [132/156] Installing rpm-build-libs-0:4 100% | 202.6 MiB/s | 207.5 KiB | 00m00s [133/156] Installing libevent-0:2.1.12- 100% | 292.8 MiB/s | 899.5 KiB | 00m00s [134/156] Installing openldap-0:2.6.8-5 100% | 316.4 MiB/s | 648.0 KiB | 00m00s [135/156] Installing libcurl-0:8.9.1-2. 100% | 400.0 MiB/s | 819.2 KiB | 00m00s [136/156] Installing elfutils-debuginfo 100% | 65.3 MiB/s | 66.9 KiB | 00m00s [137/156] Installing elfutils-0:0.191-8 100% | 365.7 MiB/s | 2.6 MiB | 00m00s [138/156] Installing binutils-0:2.42.90 100% | 403.8 MiB/s | 27.5 MiB | 00m00s >>> Running post-install scriptlet: binutils-0:2.42.90-1.fc41.x86_64 >>> Stop post-install scriptlet: binutils-0:2.42.90-1.fc41.x86_64 [139/156] Installing gdb-minimal-0:15.1 100% | 406.0 MiB/s | 13.0 MiB | 00m00s [140/156] Installing debugedit-0:5.0-17 100% | 197.3 MiB/s | 202.0 KiB | 00m00s [141/156] Installing curl-0:8.9.1-2.fc4 100% | 86.7 MiB/s | 798.6 KiB | 00m00s >>> Running pre-install scriptlet: rpm-0:4.19.92-6.fc41.x86_64 >>> Stop pre-install scriptlet: rpm-0:4.19.92-6.fc41.x86_64 [142/156] Installing rpm-0:4.19.92-6.fc 100% | 192.2 MiB/s | 2.5 MiB | 00m00s [143/156] Installing efi-srpm-macros-0: 100% | 0.0 B/s | 41.2 KiB | 00m00s [144/156] Installing lua-srpm-macros-0: 100% | 0.0 B/s | 1.9 KiB | 00m00s [145/156] Installing zig-srpm-macros-0: 100% | 0.0 B/s | 1.7 KiB | 00m00s [146/156] Installing fonts-srpm-macros- 100% | 0.0 B/s | 57.0 KiB | 00m00s [147/156] Installing forge-srpm-macros- 100% | 0.0 B/s | 40.4 KiB | 00m00s [148/156] Installing go-srpm-macros-0:3 100% | 0.0 B/s | 62.0 KiB | 00m00s [149/156] Installing python-srpm-macros 100% | 0.0 B/s | 52.2 KiB | 00m00s [150/156] Installing redhat-rpm-config- 100% | 92.8 MiB/s | 190.1 KiB | 00m00s [151/156] Installing rpm-build-0:4.19.9 100% | 99.0 MiB/s | 202.8 KiB | 00m00s [152/156] Installing pyproject-srpm-mac 100% | 2.4 MiB/s | 2.5 KiB | 00m00s [153/156] Installing util-linux-0:2.40. 100% | 187.4 MiB/s | 3.7 MiB | 00m00s >>> Running post-install scriptlet: util-linux-0:2.40.2-4.fc41.x86_64 >>> Stop post-install scriptlet: util-linux-0:2.40.2-4.fc41.x86_64 [154/156] Installing authselect-0:1.5.0 100% | 77.1 MiB/s | 157.9 KiB | 00m00s [155/156] Installing which-0:2.21-42.fc 100% | 80.5 MiB/s | 82.4 KiB | 00m00s warning: posix.fork(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.wait(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.exec(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead [156/156] Installing info-0:7.1-3.fc41. 100% | 442.2 KiB/s | 362.2 KiB | 00m01s >>> Running post-transaction scriptlet: filesystem-0:3.18-23.fc41.x86_64 >>> Stop post-transaction scriptlet: filesystem-0:3.18-23.fc41.x86_64 >>> Running post-transaction scriptlet: ca-certificates-0:2024.2.68_v8.0.302-3.f >>> Stop post-transaction scriptlet: ca-certificates-0:2024.2.68_v8.0.302-3.fc41 >>> Running post-transaction scriptlet: authselect-libs-0:1.5.0-7.fc41.x86_64 >>> Stop post-transaction scriptlet: authselect-libs-0:1.5.0-7.fc41.x86_64 >>> Running post-transaction scriptlet: rpm-0:4.19.92-6.fc41.x86_64 >>> Stop post-transaction scriptlet: rpm-0:4.19.92-6.fc41.x86_64 >>> Running trigger-install scriptlet: glibc-common-0:2.40-3.fc41.x86_64 >>> Stop trigger-install scriptlet: glibc-common-0:2.40-3.fc41.x86_64 >>> Running trigger-install scriptlet: info-0:7.1-3.fc41.x86_64 >>> Stop trigger-install scriptlet: info-0:7.1-3.fc41.x86_64 Complete! Finish: installing minimal buildroot with dnf5 Start: creating root cache Finish: creating root cache Finish: chroot init INFO: Installed packages: INFO: add-determinism-0.3.6-1.fc41.x86_64 alternatives-1.30-1.fc41.x86_64 ansible-srpm-macros-1-16.fc41.noarch audit-libs-4.0.2-1.fc41.x86_64 authselect-1.5.0-7.fc41.x86_64 authselect-libs-1.5.0-7.fc41.x86_64 basesystem-11-21.fc41.noarch bash-5.2.32-1.fc41.x86_64 binutils-2.42.90-1.fc41.x86_64 build-reproducibility-srpm-macros-0.3.6-1.fc41.noarch bzip2-1.0.8-19.fc41.x86_64 bzip2-libs-1.0.8-19.fc41.x86_64 ca-certificates-2024.2.68_v8.0.302-3.fc41.noarch coreutils-9.5-7.fc41.x86_64 coreutils-common-9.5-7.fc41.x86_64 cpio-2.15-2.fc41.x86_64 cracklib-2.9.11-6.fc41.x86_64 crypto-policies-20240807-1.git5795660.fc41.noarch curl-8.9.1-2.fc41.x86_64 cyrus-sasl-lib-2.1.28-27.fc41.x86_64 debugedit-5.0-17.fc41.x86_64 diffutils-3.10-8.fc41.x86_64 dwz-0.15-7.fc41.x86_64 ed-1.20.2-2.fc41.x86_64 efi-srpm-macros-5-12.fc41.noarch elfutils-0.191-8.fc41.x86_64 elfutils-debuginfod-client-0.191-8.fc41.x86_64 elfutils-default-yama-scope-0.191-8.fc41.noarch elfutils-libelf-0.191-8.fc41.x86_64 elfutils-libs-0.191-8.fc41.x86_64 fedora-gpg-keys-42-0.1.noarch fedora-release-42-0.1.noarch fedora-release-common-42-0.1.noarch fedora-release-identity-basic-42-0.1.noarch fedora-repos-42-0.1.noarch fedora-repos-rawhide-42-0.1.noarch file-5.45-7.fc41.x86_64 file-libs-5.45-7.fc41.x86_64 filesystem-3.18-23.fc41.x86_64 findutils-4.10.0-4.fc41.x86_64 fonts-srpm-macros-2.0.5-17.fc41.noarch forge-srpm-macros-0.3.2-1.fc41.noarch fpc-srpm-macros-1.3-13.fc41.noarch gawk-5.3.0-4.fc41.x86_64 gdb-minimal-15.1-1.fc41.x86_64 gdbm-1.23-7.fc41.x86_64 gdbm-libs-1.23-7.fc41.x86_64 ghc-srpm-macros-1.9.1-2.fc41.noarch glibc-2.40-3.fc41.x86_64 glibc-common-2.40-3.fc41.x86_64 glibc-gconv-extra-2.40-3.fc41.x86_64 glibc-minimal-langpack-2.40-3.fc41.x86_64 gmp-6.3.0-2.fc41.x86_64 gnat-srpm-macros-6-6.fc41.noarch go-srpm-macros-3.6.0-3.fc41.noarch gpg-pubkey-105ef944-65ca83d1 gpg-pubkey-31645531-66b6dccf gpg-pubkey-e99d6ad1-64d2612c grep-3.11-9.fc41.x86_64 gzip-1.13-2.fc41.x86_64 info-7.1-3.fc41.x86_64 jansson-2.13.1-10.fc41.x86_64 kernel-srpm-macros-1.0-24.fc41.noarch keyutils-libs-1.6.3-4.fc41.x86_64 krb5-libs-1.21.3-2.fc41.x86_64 libacl-2.3.2-2.fc41.x86_64 libarchive-3.7.4-3.fc41.x86_64 libattr-2.5.2-4.fc41.x86_64 libblkid-2.40.2-4.fc41.x86_64 libbrotli-1.1.0-5.fc41.x86_64 libcap-2.70-4.fc41.x86_64 libcap-ng-0.8.5-3.fc41.x86_64 libcom_err-1.47.1-3.fc41.x86_64 libcurl-8.9.1-2.fc41.x86_64 libeconf-0.6.2-3.fc41.x86_64 libevent-2.1.12-14.fc41.x86_64 libfdisk-2.40.2-4.fc41.x86_64 libffi-3.4.6-2.fc41.x86_64 libgcc-14.1.1-7.fc41.x86_64 libgomp-14.1.1-7.fc41.x86_64 libidn2-2.3.7-2.fc41.x86_64 libmount-2.40.2-4.fc41.x86_64 libnghttp2-1.62.1-2.fc41.x86_64 libnsl2-2.0.1-2.fc41.x86_64 libpkgconf-2.1.1-2.fc41.x86_64 libpsl-0.21.5-4.fc41.x86_64 libpwquality-1.4.5-11.fc41.x86_64 libselinux-3.7-5.fc41.x86_64 libsemanage-3.7-2.fc41.x86_64 libsepol-3.7-2.fc41.x86_64 libsmartcols-2.40.2-4.fc41.x86_64 libssh-0.10.6-8.fc41.x86_64 libssh-config-0.10.6-8.fc41.noarch libstdc++-14.1.1-7.fc41.x86_64 libtasn1-4.19.0-9.fc41.x86_64 libtirpc-1.3.5-0.fc41.x86_64 libtool-ltdl-2.4.7-12.fc41.x86_64 libunistring-1.1-8.fc41.x86_64 libutempter-1.2.1-15.fc41.x86_64 libuuid-2.40.2-4.fc41.x86_64 libverto-0.3.2-9.fc41.x86_64 libxcrypt-4.4.36-7.fc41.x86_64 libxml2-2.12.8-2.fc41.x86_64 libzstd-1.5.6-2.fc41.x86_64 lua-libs-5.4.6-6.fc41.x86_64 lua-srpm-macros-1-14.fc41.noarch lz4-libs-1.10.0-1.fc41.x86_64 mpfr-4.2.1-5.fc41.x86_64 ncurses-base-6.5-2.20240629.fc41.noarch ncurses-libs-6.5-2.20240629.fc41.x86_64 ocaml-srpm-macros-10-3.fc41.noarch openblas-srpm-macros-2-18.fc41.noarch openldap-2.6.8-5.fc41.x86_64 openssl-libs-3.2.2-5.fc41.x86_64 p11-kit-0.25.5-3.fc41.x86_64 p11-kit-trust-0.25.5-3.fc41.x86_64 package-notes-srpm-macros-0.5-12.fc41.noarch pam-1.6.1-5.fc41.x86_64 pam-libs-1.6.1-5.fc41.x86_64 patch-2.7.6-25.fc41.x86_64 pcre2-10.44-1.fc41.1.x86_64 pcre2-syntax-10.44-1.fc41.1.noarch perl-srpm-macros-1-56.fc41.noarch pkgconf-2.1.1-2.fc41.x86_64 pkgconf-m4-2.1.1-2.fc41.noarch pkgconf-pkg-config-2.1.1-2.fc41.x86_64 popt-1.19-7.fc41.x86_64 publicsuffix-list-dafsa-20240107-4.fc41.noarch pyproject-srpm-macros-1.14.0-1.fc41.noarch python-srpm-macros-3.13-3.fc41.noarch qt5-srpm-macros-5.15.14-3.fc41.noarch qt6-srpm-macros-6.7.2-3.fc41.noarch readline-8.2-10.fc41.x86_64 redhat-rpm-config-293-1.fc41.noarch rpm-4.19.92-6.fc41.x86_64 rpm-build-4.19.92-6.fc41.x86_64 rpm-build-libs-4.19.92-6.fc41.x86_64 rpm-libs-4.19.92-6.fc41.x86_64 rpm-sequoia-1.7.0-2.fc41.x86_64 rust-srpm-macros-26.3-1.fc41.noarch sed-4.9-3.fc41.x86_64 setup-2.15.0-5.fc41.noarch shadow-utils-4.15.1-9.fc41.x86_64 sqlite-libs-3.46.0-4.fc41.x86_64 systemd-libs-256.4-1.fc41.x86_64 tar-1.35-4.fc41.x86_64 unzip-6.0-64.fc41.x86_64 util-linux-2.40.2-4.fc41.x86_64 util-linux-core-2.40.2-4.fc41.x86_64 which-2.21-42.fc41.x86_64 xxhash-libs-0.8.2-3.fc41.x86_64 xz-5.6.2-2.fc41.x86_64 xz-libs-5.6.2-2.fc41.x86_64 zig-srpm-macros-1-3.fc41.noarch zip-3.0-41.fc41.x86_64 zlib-ng-compat-2.1.7-2.fc41.x86_64 zstd-1.5.6-2.fc41.x86_64 Start: buildsrpm Start: rpmbuild -bs Building target platforms: x86_64 Building for target x86_64 sh: -c: line 1: unexpected EOF while looking for matching `"' setting SOURCE_DATE_EPOCH=1636416000 Wrote: /builddir/build/SRPMS/cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm Finish: rpmbuild -bs cp: preserving permissions for ‘/var/lib/copr-rpmbuild/results/chroot_scan/var/lib/mock/fedora-rawhide-x86_64-1723840744.707232/root/var/log’: No such file or directory INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: /var/lib/mock/fedora-rawhide-x86_64-1723840744.707232/root/var/log/dnf5.log Finish: buildsrpm INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-4kvkk6lq/cutlass/cutlass.spec) Config(child) 0 minutes 18 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_success=True') Start: clean chroot INFO: unmounting tmpfs. Finish: clean chroot INFO: Start(/var/lib/copr-rpmbuild/results/cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm) Config(fedora-rawhide-x86_64) Start(bootstrap): chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-rawhide-x86_64-bootstrap-1723840744.707232/root. INFO: reusing tmpfs at /var/lib/mock/fedora-rawhide-x86_64-bootstrap-1723840744.707232/root. INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start(bootstrap): cleaning package manager metadata Finish(bootstrap): cleaning package manager metadata Finish(bootstrap): chroot init Start: chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-rawhide-x86_64-1723840744.707232/root. INFO: calling preinit hooks INFO: enabled root cache Start: unpacking root cache Finish: unpacking root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-4.19.92-6.fc41.x86_64 rpm-sequoia-1.7.0-2.fc41.x86_64 dnf5-5.2.5.0-2.fc41.x86_64 dnf5-plugins-5.2.5.0-2.fc41.x86_64 Finish: chroot init Start: build phase for cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm Start: build setup for cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm Building target platforms: x86_64 Building for target x86_64 sh: -c: line 1: unexpected EOF while looking for matching `"' setting SOURCE_DATE_EPOCH=1636416000 Wrote: /builddir/build/SRPMS/cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm Updating and loading repositories: Additional repo copr_rezso_CUDA 100% | 121.6 KiB/s | 1.8 KiB | 00m00s fedora 100% | 48.0 KiB/s | 6.8 KiB | 00m00s Copr repository 100% | 140.8 KiB/s | 1.8 KiB | 00m00s Additional repo http_developer_downloa 100% | 580.2 KiB/s | 3.5 KiB | 00m00s Additional repo http_developer_downloa 100% | 695.5 KiB/s | 3.5 KiB | 00m00s Repositories loaded. Package Arch Version Repository Size Installing: cmake x86_64 3.28.3-7.fc41 fedora 31.6 MiB cuda-cudart-devel-12-6 x86_64 12.6.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 6.8 MiB cuda-driver-devel-12-6 x86_64 12.6.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 130.5 KiB cuda-gcc-11-c++ x86_64 11.2.1-1.fc39 copr_base 53.8 MiB cuda-nvcc-12-6 x86_64 12.6.20-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 204.3 MiB cuda-nvml-devel-12-6 x86_64 12.6.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 1.4 MiB cuda-nvrtc-devel-12-6 x86_64 12.6.20-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 92.9 MiB cuda-nvtx-12-6 x86_64 12.6.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 406.2 KiB doxygen x86_64 2:1.12.0-1.fc41 fedora 19.2 MiB gcc-c++ x86_64 14.1.1-7.fc41 fedora 38.1 MiB git x86_64 2.46.0-1.fc41 fedora 85.2 KiB graphviz x86_64 12.0.0-3.fc41 fedora 20.0 MiB libcublas-devel-12-6 x86_64 12.6.0.22-2 copr_rezso_CUDA 1.1 MiB libcudnn8 x86_64 8.9.7.29-2.cuda12.3 copr_rezso_CUDA 1.0 GiB libcudnn8-devel x86_64 8.9.7.29-2.cuda12.3 copr_rezso_CUDA 199.2 KiB libcurand-devel-12-6 x86_64 10.3.7.37-2 copr_rezso_CUDA 2.1 MiB python3-devel x86_64 3.13.0~rc1-2.fc41 fedora 1.8 MiB python3-setuptools noarch 69.2.0-8.fc41 fedora 7.2 MiB Installing dependencies: abattis-cantarell-vf-fonts noarch 0.301-13.fc41 fedora 192.7 KiB adobe-mappings-cmap noarch 20230622-4.fc41 fedora 14.4 MiB adobe-mappings-cmap-deprecated noarch 20230622-4.fc41 fedora 582.1 KiB adobe-mappings-pdf noarch 20190401-8.fc41 fedora 4.4 MiB annobin-docs noarch 12.70-1.fc42 fedora 97.7 KiB annobin-plugin-gcc x86_64 12.70-1.fc42 fedora 985.6 KiB avahi-libs x86_64 0.8-29.fc41 fedora 166.3 KiB cairo x86_64 1.18.0-4.fc41 fedora 1.7 MiB cairo-gobject x86_64 1.18.0-4.fc41 fedora 35.2 KiB cmake-data noarch 3.28.3-7.fc41 fedora 8.0 MiB cmake-filesystem x86_64 3.28.3-7.fc41 fedora 0.0 B cmake-rpm-macros noarch 3.28.3-7.fc41 fedora 7.5 KiB cpp x86_64 14.1.1-7.fc41 fedora 35.0 MiB cuda-cccl-12-6 x86_64 12.6.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 11.6 MiB cuda-crt-12-6 x86_64 12.6.20-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 855.7 KiB cuda-cudart-12-6 x86_64 12.6.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 760.9 KiB cuda-gcc-11 x86_64 11.2.1-1.fc39 copr_base 95.9 MiB cuda-nvrtc-12-6 x86_64 12.6.20-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 60.4 MiB cuda-nvvm-12-6 x86_64 12.6.20-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 56.0 MiB cuda-toolkit-12-6-config-common noarch 12.6.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 0.0 B cuda-toolkit-12-config-common noarch 12.6.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 44.0 B cuda-toolkit-config-common noarch 12.6.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 41.0 B cups-libs x86_64 1:2.4.10-6.fc42 fedora 622.9 KiB dbus-libs x86_64 1:1.14.10-4.fc41 fedora 368.9 KiB default-fonts-core-sans noarch 4.1-3.fc42 fedora 11.9 KiB emacs-filesystem noarch 1:30.0-3.fc41 fedora 0.0 B expat x86_64 2.6.2-2.fc41 fedora 284.8 KiB fontconfig x86_64 2.15.0-8.fc41 fedora 791.9 KiB fonts-filesystem noarch 1:2.0.5-17.fc41 fedora 0.0 B freetype x86_64 2.13.2-6.fc41 fedora 850.6 KiB fribidi x86_64 1.0.15-2.fc41 fedora 368.4 KiB gcc x86_64 14.1.1-7.fc41 fedora 104.1 MiB gcc-plugin-annobin x86_64 14.1.1-7.fc41 fedora 57.1 KiB gd x86_64 2.3.3-17.fc41 fedora 403.7 KiB gdk-pixbuf2 x86_64 2.42.12-6.fc41 fedora 2.5 MiB git-core x86_64 2.46.0-1.fc41 fedora 22.1 MiB git-core-doc noarch 2.46.0-1.fc41 fedora 17.1 MiB glib2 x86_64 2.81.1-1.fc41 fedora 14.6 MiB glibc-devel x86_64 2.40-3.fc41 fedora 35.0 KiB glibc-headers-x86 noarch 2.40-3.fc41 fedora 2.2 MiB gnupg2 x86_64 2.4.5-3.fc41 fedora 9.5 MiB gnutls x86_64 3.8.7-1.fc42 fedora 3.2 MiB google-droid-sans-fonts noarch 20200215-21.fc41 fedora 6.3 MiB google-noto-fonts-common noarch 20240701-2.fc41 fedora 17.5 KiB google-noto-sans-vf-fonts noarch 20240701-2.fc41 fedora 1.2 MiB gpgme x86_64 1.23.2-5.fc41 fedora 579.3 KiB gpgmepp x86_64 1.23.2-5.fc41 fedora 424.2 KiB graphite2 x86_64 1.3.14-16.fc41 fedora 192.0 KiB graphviz-libs x86_64 12.0.0-3.fc41 fedora 1.2 MiB groff-base x86_64 1.23.0-7.fc41 fedora 3.8 MiB gts x86_64 0.7.6-49.20121130.fc41 fedora 650.2 KiB harfbuzz x86_64 9.0.0-2.fc41 fedora 2.7 MiB isl x86_64 0.16.1-21.fc41 fedora 3.0 MiB jbig2dec-libs x86_64 0.20-5.fc41 fedora 169.0 KiB jbigkit-libs x86_64 2.1-30.fc41 fedora 117.6 KiB json-c x86_64 0.17-4.fc41 fedora 82.4 KiB jsoncpp x86_64 1.9.5-8.fc41 fedora 253.4 KiB kernel-headers x86_64 6.11.0-0.rc3.30.fc41 fedora 6.4 MiB lasi x86_64 1.1.3-14.fc41 fedora 130.8 KiB lcms2 x86_64 2.16-4.fc41 fedora 424.9 KiB less x86_64 661-2.fc41 fedora 405.3 KiB libICE x86_64 1.1.1-4.fc41 fedora 181.2 KiB libSM x86_64 1.2.4-4.fc41 fedora 97.3 KiB libX11 x86_64 1.8.10-1.fc41 fedora 1.3 MiB libX11-common noarch 1.8.10-1.fc41 fedora 1.1 MiB libXau x86_64 1.0.11-7.fc41 fedora 66.9 KiB libXext x86_64 1.3.6-2.fc41 fedora 90.1 KiB libXft x86_64 2.3.8-7.fc41 fedora 164.5 KiB libXpm x86_64 3.5.17-4.fc41 fedora 148.4 KiB libXrender x86_64 0.9.11-7.fc41 fedora 50.1 KiB libXt x86_64 1.3.0-4.fc41 fedora 429.9 KiB libaom x86_64 3.9.0-3.fc41 fedora 5.1 MiB libassuan x86_64 2.5.7-2.fc41 fedora 163.8 KiB libavif x86_64 1.0.4-7.fc41 fedora 183.8 KiB libb2 x86_64 0.98.1-12.fc41 fedora 42.2 KiB libcbor x86_64 0.11.0-2.fc41 fedora 73.9 KiB libcublas-12-6 x86_64 12.6.0.22-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 533.1 MiB libcurand-12-6 x86_64 10.3.7.37-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 92.1 MiB libdatrie x86_64 0.2.13-10.fc41 fedora 57.9 KiB libdav1d x86_64 1.4.3-2.fc41 fedora 1.7 MiB libedit x86_64 3.1-53.20240808cvs.fc41 fedora 244.1 KiB libfido2 x86_64 1.15.0-2.fc41 fedora 238.2 KiB libgcrypt x86_64 1.11.0-3.fc41 fedora 1.5 MiB libgpg-error x86_64 1.50-2.fc41 fedora 889.5 KiB libgs x86_64 10.03.1-3.fc41 fedora 23.2 MiB libijs x86_64 0.35-23.fc41 fedora 61.6 KiB libimagequant x86_64 4.0.3-5.fc41 fedora 666.7 KiB libjpeg-turbo x86_64 3.0.2-3.fc41 fedora 776.9 KiB libksba x86_64 1.6.7-2.fc41 fedora 398.4 KiB liblerc x86_64 4.0.0-7.fc41 fedora 607.5 KiB libmpc x86_64 1.3.1-6.fc41 fedora 164.7 KiB libpaper x86_64 1:2.1.1-7.fc41 fedora 48.9 KiB libpng x86_64 2:1.6.40-4.fc41 fedora 245.8 KiB librsvg2 x86_64 2.57.1-8.fc41 fedora 4.1 MiB libstdc++-devel x86_64 14.1.1-7.fc41 fedora 15.4 MiB libthai x86_64 0.1.29-9.fc41 fedora 783.5 KiB libtiff x86_64 4.6.0-6.fc42 fedora 606.0 KiB libuv x86_64 1:1.48.0-2.fc41 fedora 546.8 KiB libvmaf x86_64 3.0.0-2.fc41 fedora 823.0 KiB libwebp x86_64 1.4.0-4.fc41 fedora 822.6 KiB libxcb x86_64 1.17.0-2.fc41 fedora 1.1 MiB libxcrypt-devel x86_64 4.4.36-7.fc41 fedora 30.3 KiB make x86_64 1:4.4.1-8.fc41 fedora 1.8 MiB mpdecimal x86_64 2.5.1-16.fc41 fedora 204.9 KiB ncurses x86_64 6.5-2.20240629.fc41 fedora 627.3 KiB netpbm x86_64 11.02.00-7.fc41 fedora 577.1 KiB nettle x86_64 3.10-3.fc41 fedora 793.0 KiB npth x86_64 1.7-2.fc41 fedora 49.6 KiB nspr x86_64 4.35.0-28.fc41 fedora 320.4 KiB nss x86_64 3.103.0-1.fc41 fedora 1.9 MiB nss-softokn x86_64 3.103.0-1.fc41 fedora 1.9 MiB nss-softokn-freebl x86_64 3.103.0-1.fc41 fedora 783.6 KiB nss-sysinit x86_64 3.103.0-1.fc41 fedora 22.2 KiB nss-util x86_64 3.103.0-1.fc41 fedora 205.1 KiB openjpeg x86_64 2.5.2-3.fc41 fedora 445.7 KiB openssh x86_64 9.8p1-3.fc41 fedora 1.8 MiB openssh-clients x86_64 9.8p1-3.fc41 fedora 2.6 MiB pango x86_64 1.54.0-2.fc41 fedora 996.2 KiB perl-AutoLoader noarch 5.74-511.fc41 fedora 20.5 KiB perl-B x86_64 1.89-511.fc41 fedora 498.0 KiB perl-Carp noarch 1.54-511.fc41 fedora 46.6 KiB perl-Class-Struct noarch 0.68-511.fc41 fedora 25.4 KiB perl-Data-Dumper x86_64 2.189-512.fc41 fedora 111.7 KiB perl-Digest noarch 1.20-511.fc41 fedora 35.3 KiB perl-Digest-MD5 x86_64 2.59-5.fc41 fedora 59.8 KiB perl-DynaLoader x86_64 1.56-511.fc41 fedora 32.1 KiB perl-Encode x86_64 4:3.21-511.fc41 fedora 4.7 MiB perl-Errno x86_64 1.38-511.fc41 fedora 8.4 KiB perl-Error noarch 1:0.17029-16.fc41 fedora 77.3 KiB perl-Exporter noarch 5.78-511.fc41 fedora 54.3 KiB perl-Fcntl x86_64 1.18-511.fc41 fedora 49.0 KiB perl-File-Basename noarch 2.86-511.fc41 fedora 14.0 KiB perl-File-Find noarch 1.44-511.fc41 fedora 41.9 KiB perl-File-Path noarch 2.18-511.fc41 fedora 63.5 KiB perl-File-Temp noarch 1:0.231.100-511.fc41 fedora 162.3 KiB perl-File-stat noarch 1.14-511.fc41 fedora 12.5 KiB perl-FileHandle noarch 2.05-511.fc41 fedora 9.3 KiB perl-Getopt-Long noarch 1:2.58-2.fc41 fedora 144.5 KiB perl-Getopt-Std noarch 1.14-511.fc41 fedora 11.2 KiB perl-Git noarch 2.46.0-1.fc41 fedora 64.1 KiB perl-HTTP-Tiny noarch 0.088-512.fc41 fedora 152.2 KiB perl-IO x86_64 1.55-511.fc41 fedora 151.1 KiB perl-IO-Socket-IP noarch 0.42-512.fc41 fedora 98.7 KiB perl-IO-Socket-SSL noarch 2.088-2.fc41 fedora 701.1 KiB perl-IPC-Open3 noarch 1.22-511.fc41 fedora 22.5 KiB perl-MIME-Base64 x86_64 3.16-511.fc41 fedora 46.1 KiB perl-Mozilla-CA noarch 20240730-1.fc41 fedora 9.8 KiB perl-Net-SSLeay x86_64 1.94-7.fc41 fedora 1.3 MiB perl-POSIX x86_64 2.20-511.fc41 fedora 235.1 KiB perl-PathTools x86_64 3.91-511.fc41 fedora 180.0 KiB perl-Pod-Escapes noarch 1:1.07-511.fc41 fedora 24.9 KiB perl-Pod-Perldoc noarch 3.28.01-512.fc41 fedora 163.7 KiB perl-Pod-Simple noarch 1:3.45-511.fc41 fedora 560.9 KiB perl-Pod-Usage noarch 4:2.03-511.fc41 fedora 84.8 KiB perl-Scalar-List-Utils x86_64 5:1.65-1.fc41 fedora 146.5 KiB perl-SelectSaver noarch 1.02-511.fc41 fedora 2.2 KiB perl-Socket x86_64 4:2.038-511.fc41 fedora 124.0 KiB perl-Storable x86_64 1:3.32-511.fc41 fedora 232.4 KiB perl-Symbol noarch 1.09-511.fc41 fedora 6.8 KiB perl-Term-ANSIColor noarch 5.01-512.fc41 fedora 97.5 KiB perl-Term-Cap noarch 1.18-511.fc41 fedora 29.3 KiB perl-TermReadKey x86_64 2.38-23.fc41 fedora 64.1 KiB perl-Text-ParseWords noarch 3.31-511.fc41 fedora 13.6 KiB perl-Text-Tabs+Wrap noarch 2024.001-511.fc41 fedora 22.6 KiB perl-Time-Local noarch 2:1.350-511.fc41 fedora 69.0 KiB perl-URI noarch 5.28-2.fc41 fedora 240.7 KiB perl-base noarch 2.27-511.fc41 fedora 12.5 KiB perl-constant noarch 1.33-512.fc41 fedora 26.2 KiB perl-if noarch 0.61.000-511.fc41 fedora 5.8 KiB perl-interpreter x86_64 4:5.40.0-511.fc41 fedora 122.3 KiB perl-lib x86_64 0.65-511.fc41 fedora 8.5 KiB perl-libnet noarch 3.15-512.fc41 fedora 289.4 KiB perl-libs x86_64 4:5.40.0-511.fc41 fedora 9.9 MiB perl-locale noarch 1.12-511.fc41 fedora 6.5 KiB perl-mro x86_64 1.29-511.fc41 fedora 45.6 KiB perl-overload noarch 1.37-511.fc41 fedora 71.5 KiB perl-overloading noarch 0.02-511.fc41 fedora 4.8 KiB perl-parent noarch 1:0.242-1.fc42 fedora 10.0 KiB perl-podlators noarch 1:6.0.2-2.fc41 fedora 317.5 KiB perl-vars noarch 1.05-511.fc41 fedora 3.9 KiB pixman x86_64 0.43.4-2.fc41 fedora 718.1 KiB poppler x86_64 24.02.0-4.fc41 fedora 3.5 MiB poppler-data noarch 0.4.11-8.fc41 fedora 12.3 MiB poppler-glib x86_64 24.02.0-4.fc41 fedora 575.1 KiB pyproject-rpm-macros noarch 1.14.0-1.fc41 fedora 105.7 KiB python-pip-wheel noarch 24.2-1.fc41 fedora 1.2 MiB python-rpm-macros noarch 3.13-3.fc41 fedora 22.1 KiB python3 x86_64 3.13.0~rc1-2.fc41 fedora 31.8 KiB python3-libs x86_64 3.13.0~rc1-2.fc41 fedora 40.3 MiB python3-packaging noarch 24.1-2.fc41 fedora 422.3 KiB python3-rpm-generators noarch 14-11.fc41 fedora 81.7 KiB python3-rpm-macros noarch 3.13-3.fc41 fedora 6.4 KiB rav1e-libs x86_64 0.7.1-3.fc41 fedora 3.0 MiB rhash x86_64 1.4.4-2.fc41 fedora 349.9 KiB rsvg-pixbuf-loader x86_64 2.57.1-8.fc41 fedora 15.5 KiB shared-mime-info x86_64 2.3-6.fc41 fedora 5.2 MiB svt-av1-libs x86_64 2.1.0-2.fc41 fedora 7.1 MiB tpm2-tss x86_64 4.1.3-3.fc41 fedora 1.6 MiB tzdata noarch 2024a-9.fc41 fedora 1.7 MiB urw-base35-bookman-fonts noarch 20200910-23.fc41 fedora 1.4 MiB urw-base35-c059-fonts noarch 20200910-23.fc41 fedora 1.4 MiB urw-base35-d050000l-fonts noarch 20200910-23.fc41 fedora 84.3 KiB urw-base35-fonts noarch 20200910-23.fc41 fedora 5.3 KiB urw-base35-fonts-common noarch 20200910-23.fc41 fedora 37.4 KiB urw-base35-gothic-fonts noarch 20200910-23.fc41 fedora 1.2 MiB urw-base35-nimbus-mono-ps-fonts noarch 20200910-23.fc41 fedora 1.0 MiB urw-base35-nimbus-roman-fonts noarch 20200910-23.fc41 fedora 1.4 MiB urw-base35-nimbus-sans-fonts noarch 20200910-23.fc41 fedora 2.4 MiB urw-base35-p052-fonts noarch 20200910-23.fc41 fedora 1.5 MiB urw-base35-standard-symbols-ps-fonts noarch 20200910-23.fc41 fedora 64.9 KiB urw-base35-z003-fonts noarch 20200910-23.fc41 fedora 390.8 KiB vim-filesystem noarch 2:9.1.660-2.fc41 fedora 40.0 B xapian-core-libs x86_64 1.4.23-3.fc41 fedora 2.1 MiB xml-common noarch 0.6.3-65.fc41 fedora 78.4 KiB Transaction Summary: Installing: 232 packages Total size of inbound packages is 1 GiB. Need to download 1 GiB. After this operation 3 GiB will be used (install 3 GiB, remove 0 B). [ 1/232] cuda-driver-devel-12-6-0:12.6 100% | 6.0 MiB/s | 42.7 KiB | 00m00s [ 2/232] cuda-nvml-devel-12-6-0:12.6.3 100% | 31.0 MiB/s | 222.5 KiB | 00m00s [ 3/232] cuda-cudart-devel-12-6-0:12.6 100% | 60.4 MiB/s | 2.1 MiB | 00m00s [ 4/232] cuda-nvtx-12-6-0:12.6.37-1.x8 100% | 21.4 MiB/s | 87.8 KiB | 00m00s [ 5/232] cuda-nvrtc-devel-12-6-0:12.6. 100% | 272.0 MiB/s | 28.6 MiB | 00m00s [ 6/232] git-0:2.46.0-1.fc41.x86_64 100% | 3.9 MiB/s | 52.4 KiB | 00m00s [ 7/232] doxygen-2:1.12.0-1.fc41.x86_6 100% | 53.2 MiB/s | 5.5 MiB | 00m00s [ 8/232] python3-setuptools-0:69.2.0-8 100% | 97.8 MiB/s | 1.6 MiB | 00m00s [ 9/232] gcc-c++-0:14.1.1-7.fc41.x86_6 100% | 283.4 MiB/s | 14.2 MiB | 00m00s [ 10/232] cuda-nvcc-12-6-0:12.6.20-1.x8 100% | 255.8 MiB/s | 67.3 MiB | 00m00s [ 11/232] graphviz-0:12.0.0-3.fc41.x86_ 100% | 62.0 MiB/s | 4.5 MiB | 00m00s [ 12/232] libcublas-devel-12-6-0:12.6.0 100% | 6.7 MiB/s | 89.1 KiB | 00m00s [ 13/232] libcudnn8-devel-0:8.9.7.29-2. 100% | 8.2 MiB/s | 33.6 KiB | 00m00s [ 14/232] libcurand-devel-12-6-0:10.3.7 100% | 22.0 MiB/s | 248.1 KiB | 00m00s [ 15/232] python3-devel-0:3.13.0~rc1-2. 100% | 65.4 MiB/s | 401.8 KiB | 00m00s [ 16/232] cuda-cccl-12-6-0:12.6.37-1.x8 100% | 85.5 MiB/s | 1.6 MiB | 00m00s [ 17/232] cuda-cudart-12-6-0:12.6.37-1. 100% | 27.6 MiB/s | 225.9 KiB | 00m00s [ 18/232] cuda-crt-12-6-0:12.6.20-1.x86 100% | 26.7 MiB/s | 109.5 KiB | 00m00s [ 19/232] cuda-nvvm-12-6-0:12.6.20-1.x8 100% | 345.5 MiB/s | 23.5 MiB | 00m00s [ 20/232] cmake-0:3.28.3-7.fc41.x86_64 100% | 36.3 MiB/s | 9.8 MiB | 00m00s [ 21/232] perl-interpreter-4:5.40.0-511 100% | 14.1 MiB/s | 72.3 KiB | 00m00s [ 22/232] xapian-core-libs-0:1.4.23-3.f 100% | 36.3 MiB/s | 779.5 KiB | 00m00s [ 23/232] cuda-nvrtc-12-6-0:12.6.20-1.x 100% | 324.7 MiB/s | 22.4 MiB | 00m00s [ 24/232] git-core-0:2.46.0-1.fc41.x86_ 100% | 48.0 MiB/s | 4.8 MiB | 00m00s [ 25/232] perl-File-Basename-0:2.86-511 100% | 8.4 MiB/s | 17.1 KiB | 00m00s [ 26/232] perl-File-Find-0:1.44-511.fc4 100% | 12.3 MiB/s | 25.3 KiB | 00m00s [ 27/232] perl-Getopt-Long-1:2.58-2.fc4 100% | 15.6 MiB/s | 63.9 KiB | 00m00s [ 28/232] perl-Git-0:2.46.0-1.fc41.noar 100% | 7.7 MiB/s | 39.2 KiB | 00m00s [ 29/232] perl-IPC-Open3-0:1.22-511.fc4 100% | 10.7 MiB/s | 21.8 KiB | 00m00s [ 30/232] perl-PathTools-0:3.91-511.fc4 100% | 14.2 MiB/s | 87.4 KiB | 00m00s [ 31/232] perl-TermReadKey-0:2.38-23.fc 100% | 11.6 MiB/s | 35.6 KiB | 00m00s [ 32/232] perl-lib-0:0.65-511.fc41.x86_ 100% | 2.9 MiB/s | 14.9 KiB | 00m00s [ 33/232] git-core-doc-0:2.46.0-1.fc41. 100% | 25.9 MiB/s | 3.0 MiB | 00m00s [ 34/232] cmake-filesystem-0:3.28.3-7.f 100% | 17.8 MiB/s | 18.2 KiB | 00m00s [ 35/232] expat-0:2.6.2-2.fc41.x86_64 100% | 55.4 MiB/s | 113.4 KiB | 00m00s [ 36/232] jsoncpp-0:1.9.5-8.fc41.x86_64 100% | 19.4 MiB/s | 99.3 KiB | 00m00s [ 37/232] libuv-1:1.48.0-2.fc41.x86_64 100% | 35.7 MiB/s | 256.1 KiB | 00m00s [ 38/232] make-1:4.4.1-8.fc41.x86_64 100% | 143.1 MiB/s | 586.1 KiB | 00m00s [ 39/232] rhash-0:1.4.4-2.fc41.x86_64 100% | 38.3 MiB/s | 196.0 KiB | 00m00s [ 40/232] cmake-data-0:3.28.3-7.fc41.no 100% | 26.4 MiB/s | 2.3 MiB | 00m00s [ 41/232] libmpc-0:1.3.1-6.fc41.x86_64 100% | 11.6 MiB/s | 71.1 KiB | 00m00s [ 42/232] cairo-0:1.18.0-4.fc41.x86_64 100% | 69.3 MiB/s | 709.9 KiB | 00m00s [ 43/232] fontconfig-0:2.15.0-8.fc41.x8 100% | 43.9 MiB/s | 269.9 KiB | 00m00s [ 44/232] freetype-0:2.13.2-6.fc41.x86_ 100% | 57.3 MiB/s | 410.7 KiB | 00m00s [ 45/232] gd-0:2.3.3-17.fc41.x86_64 100% | 33.2 MiB/s | 135.8 KiB | 00m00s [ 46/232] gdk-pixbuf2-0:2.42.12-6.fc41. 100% | 68.2 MiB/s | 489.0 KiB | 00m00s [ 47/232] gcc-0:14.1.1-7.fc41.x86_64 100% | 296.4 MiB/s | 37.1 MiB | 00m00s [ 48/232] glib2-0:2.81.1-1.fc41.x86_64 100% | 74.3 MiB/s | 3.0 MiB | 00m00s [ 49/232] graphviz-libs-0:12.0.0-3.fc41 100% | 50.7 MiB/s | 466.9 KiB | 00m00s [ 50/232] gts-0:0.7.6-49.20121130.fc41. 100% | 117.8 MiB/s | 241.3 KiB | 00m00s [ 51/232] harfbuzz-0:9.0.0-2.fc41.x86_6 100% | 146.0 MiB/s | 1.0 MiB | 00m00s [ 52/232] lasi-0:1.1.3-14.fc41.x86_64 100% | 9.0 MiB/s | 55.6 KiB | 00m00s [ 53/232] libX11-0:1.8.10-1.fc41.x86_64 100% | 105.7 MiB/s | 649.6 KiB | 00m00s [ 54/232] libXrender-0:0.9.11-7.fc41.x8 100% | 3.8 MiB/s | 27.5 KiB | 00m00s [ 55/232] librsvg2-0:2.57.1-8.fc41.x86_ 100% | 125.3 MiB/s | 1.5 MiB | 00m00s [ 56/232] libgs-0:10.03.1-3.fc41.x86_64 100% | 189.4 MiB/s | 3.4 MiB | 00m00s [ 57/232] libwebp-0:1.4.0-4.fc41.x86_64 100% | 57.0 MiB/s | 292.0 KiB | 00m00s [ 58/232] pango-0:1.54.0-2.fc41.x86_64 100% | 169.7 MiB/s | 347.5 KiB | 00m00s [ 59/232] poppler-glib-0:24.02.0-4.fc41 100% | 61.5 MiB/s | 188.9 KiB | 00m00s [ 60/232] urw-base35-fonts-0:20200910-2 100% | 4.9 MiB/s | 10.0 KiB | 00m00s [ 61/232] libcurand-12-6-0:10.3.7.37-1. 100% | 338.6 MiB/s | 52.8 MiB | 00m00s [ 62/232] python3-libs-0:3.13.0~rc1-2.f 100% | 193.5 MiB/s | 9.1 MiB | 00m00s [ 63/232] perl-libs-4:5.40.0-511.fc41.x 100% | 37.0 MiB/s | 2.3 MiB | 00m00s [ 64/232] less-0:661-2.fc41.x86_64 100% | 30.7 MiB/s | 188.5 KiB | 00m00s [ 65/232] openssh-clients-0:9.8p1-3.fc4 100% | 32.9 MiB/s | 741.8 KiB | 00m00s [ 66/232] perl-Carp-0:1.54-511.fc41.noa 100% | 825.5 KiB/s | 28.9 KiB | 00m00s [ 67/232] perl-Exporter-0:5.78-511.fc41 100% | 15.1 MiB/s | 30.9 KiB | 00m00s [ 68/232] perl-Pod-Usage-4:2.03-511.fc4 100% | 13.0 MiB/s | 40.0 KiB | 00m00s [ 69/232] perl-Text-ParseWords-0:3.31-5 100% | 8.1 MiB/s | 16.6 KiB | 00m00s [ 70/232] perl-base-0:2.27-511.fc41.noa 100% | 7.9 MiB/s | 16.2 KiB | 00m00s [ 71/232] perl-constant-0:1.33-512.fc41 100% | 2.5 MiB/s | 23.0 KiB | 00m00s [ 72/232] perl-overload-0:1.37-511.fc41 100% | 5.6 MiB/s | 45.5 KiB | 00m00s [ 73/232] perl-Error-1:0.17029-16.fc41. 100% | 7.9 MiB/s | 40.6 KiB | 00m00s [ 74/232] perl-Fcntl-0:1.18-511.fc41.x8 100% | 14.5 MiB/s | 29.8 KiB | 00m00s [ 75/232] perl-IO-0:1.55-511.fc41.x86_6 100% | 26.6 MiB/s | 81.9 KiB | 00m00s [ 76/232] perl-POSIX-0:2.20-511.fc41.x8 100% | 23.7 MiB/s | 97.0 KiB | 00m00s [ 77/232] perl-Symbol-0:1.09-511.fc41.n 100% | 13.8 MiB/s | 14.2 KiB | 00m00s [ 78/232] perl-Errno-0:1.38-511.fc41.x8 100% | 7.3 MiB/s | 14.9 KiB | 00m00s [ 79/232] perl-Scalar-List-Utils-5:1.65 100% | 23.5 MiB/s | 72.1 KiB | 00m00s [ 80/232] perl-DynaLoader-0:1.56-511.fc 100% | 12.7 MiB/s | 26.0 KiB | 00m00s [ 81/232] perl-vars-0:1.05-511.fc41.noa 100% | 6.3 MiB/s | 13.0 KiB | 00m00s [ 82/232] emacs-filesystem-1:30.0-3.fc4 100% | 2.3 MiB/s | 7.1 KiB | 00m00s [ 83/232] vim-filesystem-2:9.1.660-2.fc 100% | 8.3 MiB/s | 17.0 KiB | 00m00s [ 84/232] cpp-0:14.1.1-7.fc41.x86_64 100% | 265.6 MiB/s | 12.0 MiB | 00m00s [ 85/232] libXext-0:1.3.6-2.fc41.x86_64 100% | 19.1 MiB/s | 39.1 KiB | 00m00s [ 86/232] libpng-2:1.6.40-4.fc41.x86_64 100% | 58.7 MiB/s | 120.3 KiB | 00m00s [ 87/232] libxcb-0:1.17.0-2.fc41.x86_64 100% | 77.8 MiB/s | 239.0 KiB | 00m00s [ 88/232] pixman-0:0.43.4-2.fc41.x86_64 100% | 143.6 MiB/s | 294.1 KiB | 00m00s [ 89/232] default-fonts-core-sans-0:4.1 100% | 30.3 MiB/s | 31.0 KiB | 00m00s [ 90/232] fonts-filesystem-1:2.0.5-17.f 100% | 8.3 MiB/s | 8.5 KiB | 00m00s [ 91/232] xml-common-0:0.6.3-65.fc41.no 100% | 30.5 MiB/s | 31.2 KiB | 00m00s [ 92/232] libXpm-0:3.5.17-4.fc41.x86_64 100% | 64.5 MiB/s | 66.0 KiB | 00m00s [ 93/232] libavif-0:1.0.4-7.fc41.x86_64 100% | 44.6 MiB/s | 91.3 KiB | 00m00s [ 94/232] libimagequant-0:4.0.3-5.fc41. 100% | 147.1 MiB/s | 301.2 KiB | 00m00s [ 95/232] libjpeg-turbo-0:3.0.2-3.fc41. 100% | 111.0 MiB/s | 227.3 KiB | 00m00s [ 96/232] libtiff-0:4.6.0-6.fc42.x86_64 100% | 103.5 MiB/s | 212.0 KiB | 00m00s [ 97/232] shared-mime-info-0:2.3-6.fc41 100% | 127.2 MiB/s | 390.6 KiB | 00m00s [ 98/232] gnutls-0:3.8.7-1.fc42.x86_64 100% | 223.0 MiB/s | 1.1 MiB | 00m00s [ 99/232] netpbm-0:11.02.00-7.fc41.x86_ 100% | 89.9 MiB/s | 184.1 KiB | 00m00s [100/232] graphite2-0:1.3.14-16.fc41.x8 100% | 46.5 MiB/s | 95.1 KiB | 00m00s [101/232] libX11-common-0:1.8.10-1.fc41 100% | 85.8 MiB/s | 175.7 KiB | 00m00s [102/232] adobe-mappings-cmap-0:2023062 100% | 176.7 MiB/s | 2.1 MiB | 00m00s [103/232] adobe-mappings-cmap-deprecate 100% | 53.9 MiB/s | 110.5 KiB | 00m00s [104/232] adobe-mappings-pdf-0:20190401 100% | 122.5 MiB/s | 627.3 KiB | 00m00s [105/232] cups-libs-1:2.4.10-6.fc42.x86 100% | 126.9 MiB/s | 259.8 KiB | 00m00s [106/232] google-droid-sans-fonts-0:202 100% | 180.3 MiB/s | 2.7 MiB | 00m00s [107/232] jbig2dec-libs-0:0.20-5.fc41.x 100% | 36.1 MiB/s | 74.0 KiB | 00m00s [108/232] lcms2-0:2.16-4.fc41.x86_64 100% | 176.1 MiB/s | 180.3 KiB | 00m00s [109/232] libXt-0:1.3.0-4.fc41.x86_64 100% | 86.6 MiB/s | 177.4 KiB | 00m00s [110/232] libijs-0:0.35-23.fc41.x86_64 100% | 14.4 MiB/s | 29.5 KiB | 00m00s [111/232] libpaper-1:2.1.1-7.fc41.x86_6 100% | 26.7 MiB/s | 27.3 KiB | 00m00s [112/232] openjpeg-0:2.5.2-3.fc41.x86_6 100% | 91.2 MiB/s | 186.8 KiB | 00m00s [113/232] cairo-gobject-0:1.18.0-4.fc41 100% | 5.6 MiB/s | 17.3 KiB | 00m00s [114/232] fribidi-0:1.0.15-2.fc41.x86_6 100% | 30.0 MiB/s | 92.2 KiB | 00m00s [115/232] libXft-0:2.3.8-7.fc41.x86_64 100% | 35.3 MiB/s | 72.3 KiB | 00m00s [116/232] libthai-0:0.1.29-9.fc41.x86_6 100% | 68.9 MiB/s | 211.8 KiB | 00m00s [117/232] poppler-0:24.02.0-4.fc41.x86_ 100% | 170.1 MiB/s | 1.2 MiB | 00m00s [118/232] urw-base35-bookman-fonts-0:20 100% | 103.4 MiB/s | 846.8 KiB | 00m00s [119/232] urw-base35-c059-fonts-0:20200 100% | 170.7 MiB/s | 874.0 KiB | 00m00s [120/232] urw-base35-d050000l-fonts-0:2 100% | 73.9 MiB/s | 75.7 KiB | 00m00s [121/232] urw-base35-fonts-common-0:202 100% | 10.1 MiB/s | 20.7 KiB | 00m00s [122/232] urw-base35-gothic-fonts-0:202 100% | 125.5 MiB/s | 642.4 KiB | 00m00s [123/232] urw-base35-nimbus-mono-ps-fon 100% | 155.2 MiB/s | 794.6 KiB | 00m00s [124/232] urw-base35-nimbus-roman-fonts 100% | 119.4 MiB/s | 856.0 KiB | 00m00s [125/232] urw-base35-nimbus-sans-fonts- 100% | 163.2 MiB/s | 1.3 MiB | 00m00s [126/232] urw-base35-p052-fonts-0:20200 100% | 135.8 MiB/s | 973.1 KiB | 00m00s [127/232] urw-base35-standard-symbols-p 100% | 28.4 MiB/s | 58.2 KiB | 00m00s [128/232] urw-base35-z003-fonts-0:20200 100% | 89.7 MiB/s | 275.4 KiB | 00m00s [129/232] libb2-0:0.98.1-12.fc41.x86_64 100% | 25.1 MiB/s | 25.7 KiB | 00m00s [130/232] mpdecimal-0:2.5.1-16.fc41.x86 100% | 43.4 MiB/s | 89.0 KiB | 00m00s [131/232] python-pip-wheel-0:24.2-1.fc4 100% | 150.2 MiB/s | 1.2 MiB | 00m00s [132/232] tzdata-0:2024a-9.fc41.noarch 100% | 139.6 MiB/s | 714.7 KiB | 00m00s [133/232] libedit-0:3.1-53.20240808cvs. 100% | 103.2 MiB/s | 105.6 KiB | 00m00s [134/232] libfido2-0:1.15.0-2.fc41.x86_ 100% | 24.0 MiB/s | 98.1 KiB | 00m00s [135/232] openssh-0:9.8p1-3.fc41.x86_64 100% | 40.5 MiB/s | 414.7 KiB | 00m00s [136/232] perl-Pod-Perldoc-0:3.28.01-51 100% | 16.8 MiB/s | 86.1 KiB | 00m00s [137/232] perl-podlators-1:6.0.2-2.fc41 100% | 41.9 MiB/s | 128.8 KiB | 00m00s [138/232] perl-mro-0:1.29-511.fc41.x86_ 100% | 9.7 MiB/s | 29.9 KiB | 00m00s [139/232] perl-overloading-0:0.02-511.f 100% | 6.3 MiB/s | 12.9 KiB | 00m00s [140/232] perl-File-stat-0:1.14-511.fc4 100% | 8.3 MiB/s | 17.0 KiB | 00m00s [141/232] perl-SelectSaver-0:1.02-511.f 100% | 5.7 MiB/s | 11.7 KiB | 00m00s [142/232] perl-Socket-4:2.038-511.fc41. 100% | 17.8 MiB/s | 54.8 KiB | 00m00s [143/232] perl-locale-0:1.12-511.fc41.n 100% | 6.6 MiB/s | 13.6 KiB | 00m00s [144/232] libXau-0:1.0.11-7.fc41.x86_64 100% | 31.1 MiB/s | 31.9 KiB | 00m00s [145/232] abattis-cantarell-vf-fonts-0: 100% | 58.7 MiB/s | 120.2 KiB | 00m00s [146/232] google-noto-sans-vf-fonts-0:2 100% | 145.0 MiB/s | 594.1 KiB | 00m00s [147/232] libaom-0:3.9.0-3.fc41.x86_64 100% | 183.3 MiB/s | 1.8 MiB | 00m00s [148/232] libdav1d-0:1.4.3-2.fc41.x86_6 100% | 87.3 MiB/s | 625.6 KiB | 00m00s [149/232] libcublas-12-6-0:12.6.0.22-1. 100% | 335.5 MiB/s | 350.6 MiB | 00m01s [150/232] rav1e-libs-0:0.7.1-3.fc41.x86 100% | 3.2 MiB/s | 1.0 MiB | 00m00s [151/232] svt-av1-libs-0:2.1.0-2.fc41.x 100% | 225.8 MiB/s | 2.0 MiB | 00m00s [152/232] jbigkit-libs-0:2.1-30.fc41.x8 100% | 7.4 MiB/s | 53.3 KiB | 00m00s [153/232] liblerc-0:4.0.0-7.fc41.x86_64 100% | 102.7 MiB/s | 210.3 KiB | 00m00s [154/232] nettle-0:3.10-3.fc41.x86_64 100% | 139.5 MiB/s | 428.5 KiB | 00m00s [155/232] avahi-libs-0:0.8-29.fc41.x86_ 100% | 32.6 MiB/s | 66.7 KiB | 00m00s [156/232] libICE-0:1.1.1-4.fc41.x86_64 100% | 36.5 MiB/s | 74.7 KiB | 00m00s [157/232] libSM-0:1.2.4-4.fc41.x86_64 100% | 42.2 MiB/s | 43.2 KiB | 00m00s [158/232] libdatrie-0:0.2.13-10.fc41.x8 100% | 15.7 MiB/s | 32.2 KiB | 00m00s [159/232] gpgmepp-0:1.23.2-5.fc41.x86_6 100% | 67.4 MiB/s | 138.1 KiB | 00m00s [160/232] nspr-0:4.35.0-28.fc41.x86_64 100% | 67.1 MiB/s | 137.5 KiB | 00m00s [161/232] nss-0:3.103.0-1.fc41.x86_64 100% | 171.9 MiB/s | 704.3 KiB | 00m00s [162/232] poppler-data-0:0.4.11-8.fc41. 100% | 246.8 MiB/s | 2.0 MiB | 00m00s [163/232] libcbor-0:0.11.0-2.fc41.x86_6 100% | 4.6 MiB/s | 33.1 KiB | 00m00s [164/232] perl-File-Temp-1:0.231.100-51 100% | 14.4 MiB/s | 59.1 KiB | 00m00s [165/232] perl-HTTP-Tiny-0:0.088-512.fc 100% | 13.6 MiB/s | 55.8 KiB | 00m00s [166/232] perl-Pod-Simple-1:3.45-511.fc 100% | 16.5 MiB/s | 219.0 KiB | 00m00s [167/232] groff-base-0:1.23.0-7.fc41.x8 100% | 44.0 MiB/s | 1.1 MiB | 00m00s [168/232] perl-parent-1:0.242-1.fc42.no 100% | 4.9 MiB/s | 15.0 KiB | 00m00s [169/232] perl-Term-ANSIColor-0:5.01-51 100% | 15.5 MiB/s | 47.7 KiB | 00m00s [170/232] perl-Term-Cap-0:1.18-511.fc41 100% | 7.2 MiB/s | 22.1 KiB | 00m00s [171/232] perl-Class-Struct-0:0.68-511. 100% | 10.8 MiB/s | 22.0 KiB | 00m00s [172/232] google-noto-fonts-common-0:20 100% | 8.8 MiB/s | 18.0 KiB | 00m00s [173/232] libvmaf-0:3.0.0-2.fc41.x86_64 100% | 94.6 MiB/s | 193.7 KiB | 00m00s [174/232] dbus-libs-1:1.14.10-4.fc41.x8 100% | 151.4 MiB/s | 155.1 KiB | 00m00s [175/232] gpgme-0:1.23.2-5.fc41.x86_64 100% | 103.7 MiB/s | 212.4 KiB | 00m00s [176/232] libassuan-0:2.5.7-2.fc41.x86_ 100% | 32.8 MiB/s | 67.1 KiB | 00m00s [177/232] nss-softokn-0:3.103.0-1.fc41. 100% | 200.5 MiB/s | 410.7 KiB | 00m00s [178/232] nss-sysinit-0:3.103.0-1.fc41. 100% | 18.7 MiB/s | 19.2 KiB | 00m00s [179/232] nss-util-0:3.103.0-1.fc41.x86 100% | 84.1 MiB/s | 86.1 KiB | 00m00s [180/232] perl-File-Path-0:2.18-511.fc4 100% | 11.5 MiB/s | 35.3 KiB | 00m00s [181/232] perl-MIME-Base64-0:3.16-511.f 100% | 14.6 MiB/s | 29.9 KiB | 00m00s [182/232] perl-Mozilla-CA-0:20240730-1. 100% | 7.0 MiB/s | 14.3 KiB | 00m00s [183/232] perl-IO-Socket-SSL-0:2.088-2. 100% | 25.0 MiB/s | 230.4 KiB | 00m00s [184/232] perl-Time-Local-2:1.350-511.f 100% | 11.2 MiB/s | 34.5 KiB | 00m00s [185/232] perl-Pod-Escapes-1:1.07-511.f 100% | 9.7 MiB/s | 19.8 KiB | 00m00s [186/232] perl-Net-SSLeay-0:1.94-7.fc41 100% | 36.7 MiB/s | 375.7 KiB | 00m00s [187/232] perl-Text-Tabs+Wrap-0:2024.00 100% | 10.7 MiB/s | 21.9 KiB | 00m00s [188/232] perl-if-0:0.61.000-511.fc41.n 100% | 13.6 MiB/s | 14.0 KiB | 00m00s [189/232] gnupg2-0:2.4.5-3.fc41.x86_64 100% | 298.1 MiB/s | 2.7 MiB | 00m00s [190/232] ncurses-0:6.5-2.20240629.fc41 100% | 31.8 MiB/s | 423.8 KiB | 00m00s [191/232] libgpg-error-0:1.50-2.fc41.x8 100% | 77.3 MiB/s | 237.5 KiB | 00m00s [192/232] nss-softokn-freebl-0:3.103.0- 100% | 73.7 MiB/s | 301.8 KiB | 00m00s [193/232] perl-IO-Socket-IP-0:0.42-512. 100% | 10.2 MiB/s | 41.8 KiB | 00m00s [194/232] perl-URI-0:5.28-2.fc41.noarch 100% | 32.4 MiB/s | 132.6 KiB | 00m00s [195/232] libgcrypt-0:1.11.0-3.fc41.x86 100% | 188.3 MiB/s | 578.5 KiB | 00m00s [196/232] perl-AutoLoader-0:5.74-511.fc 100% | 4.1 MiB/s | 21.2 KiB | 00m00s [197/232] libksba-0:1.6.7-2.fc41.x86_64 100% | 78.0 MiB/s | 159.7 KiB | 00m00s [198/232] npth-0:1.7-2.fc41.x86_64 100% | 12.3 MiB/s | 25.2 KiB | 00m00s [199/232] libcudnn8-0:8.9.7.29-2.cuda12 100% | 214.8 MiB/s | 446.6 MiB | 00m02s [200/232] tpm2-tss-0:4.1.3-3.fc41.x86_6 100% | 1.0 MiB/s | 411.5 KiB | 00m00s [201/232] perl-Data-Dumper-0:2.189-512. 100% | 140.2 KiB/s | 56.3 KiB | 00m00s [202/232] json-c-0:0.17-4.fc41.x86_64 100% | 14.3 MiB/s | 44.0 KiB | 00m00s [203/232] perl-libnet-0:3.15-512.fc41.n 100% | 25.1 MiB/s | 128.5 KiB | 00m00s [204/232] perl-FileHandle-0:2.05-511.fc 100% | 7.5 MiB/s | 15.5 KiB | 00m00s [205/232] perl-B-0:1.89-511.fc41.x86_64 100% | 21.5 MiB/s | 176.3 KiB | 00m00s [206/232] perl-Digest-MD5-0:2.59-5.fc41 100% | 8.8 MiB/s | 36.0 KiB | 00m00s [207/232] perl-Digest-0:1.20-511.fc41.n 100% | 6.1 MiB/s | 24.9 KiB | 00m00s [208/232] annobin-plugin-gcc-0:12.70-1. 100% | 237.2 MiB/s | 971.7 KiB | 00m00s [209/232] gcc-plugin-annobin-0:14.1.1-7 100% | 51.2 MiB/s | 52.4 KiB | 00m00s [210/232] annobin-docs-0:12.70-1.fc42.n 100% | 44.9 MiB/s | 92.0 KiB | 00m00s [211/232] pyproject-rpm-macros-0:1.14.0 100% | 10.4 MiB/s | 42.6 KiB | 00m00s [212/232] python-rpm-macros-0:3.13-3.fc 100% | 8.6 MiB/s | 17.7 KiB | 00m00s [213/232] python3-rpm-generators-0:14-1 100% | 14.3 MiB/s | 29.3 KiB | 00m00s [214/232] python3-rpm-macros-0:3.13-3.f 100% | 6.1 MiB/s | 12.4 KiB | 00m00s [215/232] cmake-rpm-macros-0:3.28.3-7.f 100% | 8.7 MiB/s | 17.7 KiB | 00m00s [216/232] python3-packaging-0:24.1-2.fc 100% | 61.2 MiB/s | 125.4 KiB | 00m00s [217/232] cuda-toolkit-12-6-config-comm 100% | 2.5 MiB/s | 7.7 KiB | 00m00s [218/232] cuda-toolkit-12-config-common 100% | 262.7 KiB/s | 7.9 KiB | 00m00s [219/232] cuda-toolkit-config-common-0: 100% | 2.6 MiB/s | 7.9 KiB | 00m00s [220/232] cuda-gcc-11-c++-0:11.2.1-1.fc 100% | 160.2 MiB/s | 13.5 MiB | 00m00s [221/232] isl-0:0.16.1-21.fc41.x86_64 100% | 48.8 MiB/s | 850.2 KiB | 00m00s [222/232] python3-0:3.13.0~rc1-2.fc41.x 100% | 26.6 MiB/s | 27.2 KiB | 00m00s [223/232] perl-Getopt-Std-0:1.14-511.fc 100% | 7.6 MiB/s | 15.6 KiB | 00m00s [224/232] perl-Storable-1:3.32-511.fc41 100% | 13.7 MiB/s | 98.4 KiB | 00m00s [225/232] rsvg-pixbuf-loader-0:2.57.1-8 100% | 2.2 MiB/s | 16.1 KiB | 00m00s [226/232] cuda-gcc-11-0:11.2.1-1.fc39.x 100% | 215.6 MiB/s | 30.0 MiB | 00m00s [227/232] libstdc++-devel-0:14.1.1-7.fc 100% | 45.1 MiB/s | 2.7 MiB | 00m00s [228/232] perl-Encode-4:3.21-511.fc41.x 100% | 13.2 MiB/s | 1.1 MiB | 00m00s [229/232] glibc-devel-0:2.40-3.fc41.x86 100% | 5.0 MiB/s | 133.3 KiB | 00m00s [230/232] glibc-headers-x86-0:2.40-3.fc 100% | 102.7 MiB/s | 630.7 KiB | 00m00s [231/232] libxcrypt-devel-0:4.4.36-7.fc 100% | 7.0 MiB/s | 28.9 KiB | 00m00s [232/232] kernel-headers-0:6.11.0-0.rc3 100% | 147.7 MiB/s | 1.6 MiB | 00m00s -------------------------------------------------------------------------------- [232/232] Total 100% | 460.8 MiB/s | 1.2 GiB | 00m03s Running transaction [ 1/234] Verify package files 100% | 50.0 B/s | 232.0 B | 00m05s [ 2/234] Prepare transaction 100% | 2.4 KiB/s | 232.0 B | 00m00s [ 3/234] Installing libpng-2:1.6.40-4. 100% | 241.3 MiB/s | 247.1 KiB | 00m00s [ 4/234] Installing nspr-0:4.35.0-28.f 100% | 157.3 MiB/s | 322.1 KiB | 00m00s [ 5/234] Installing libgpg-error-0:1.5 100% | 218.6 MiB/s | 895.4 KiB | 00m00s [ 6/234] Installing libjpeg-turbo-0:3. 100% | 380.2 MiB/s | 778.7 KiB | 00m00s [ 7/234] Installing fonts-filesystem-1 100% | 0.0 B/s | 788.0 B | 00m00s [ 8/234] Installing urw-base35-fonts-c 100% | 0.0 B/s | 38.4 KiB | 00m00s [ 9/234] Installing expat-0:2.6.2-2.fc 100% | 280.1 MiB/s | 286.8 KiB | 00m00s [ 10/234] Installing nss-util-0:3.103.0 100% | 201.3 MiB/s | 206.1 KiB | 00m00s [ 11/234] Installing libwebp-0:1.4.0-4. 100% | 269.1 MiB/s | 826.8 KiB | 00m00s [ 12/234] Installing libmpc-0:1.3.1-6.f 100% | 162.3 MiB/s | 166.2 KiB | 00m00s [ 13/234] Installing libassuan-0:2.5.7- 100% | 161.7 MiB/s | 165.6 KiB | 00m00s [ 14/234] Installing cuda-toolkit-confi 100% | 0.0 B/s | 312.0 B | 00m00s [ 15/234] Installing cuda-toolkit-12-co 100% | 0.0 B/s | 316.0 B | 00m00s [ 16/234] Installing cuda-toolkit-12-6- 100% | 0.0 B/s | 124.0 B | 00m00s [ 17/234] Installing python-rpm-macros- 100% | 0.0 B/s | 22.8 KiB | 00m00s [ 18/234] Installing python3-rpm-macros 100% | 0.0 B/s | 6.7 KiB | 00m00s [ 19/234] Installing libICE-0:1.1.1-4.f 100% | 178.3 MiB/s | 182.6 KiB | 00m00s [ 20/234] Installing openjpeg-0:2.5.2-3 100% | 218.6 MiB/s | 447.7 KiB | 00m00s [ 21/234] Installing lcms2-0:2.16-4.fc4 100% | 208.3 MiB/s | 426.5 KiB | 00m00s [ 22/234] Installing adobe-mappings-cma 100% | 351.6 MiB/s | 14.4 MiB | 00m00s [ 23/234] Installing make-1:4.4.1-8.fc4 100% | 300.0 MiB/s | 1.8 MiB | 00m00s [ 24/234] Installing cmake-filesystem-0 100% | 7.0 MiB/s | 7.1 KiB | 00m00s [ 25/234] Installing adobe-mappings-cma 100% | 285.7 MiB/s | 585.2 KiB | 00m00s [ 26/234] Installing libSM-0:1.2.4-4.fc 100% | 96.3 MiB/s | 98.7 KiB | 00m00s [ 27/234] Installing pyproject-rpm-macr 100% | 105.1 MiB/s | 107.7 KiB | 00m00s [ 28/234] Installing cuda-cudart-12-6-0 100% | 67.7 MiB/s | 762.4 KiB | 00m00s >>> Running post-install scriptlet: cuda-cudart-12-6-0:12.6.37-1.x86_64 >>> Stop post-install scriptlet: cuda-cudart-12-6-0:12.6.37-1.x86_64 [ 29/234] Installing libcublas-12-6-0:1 100% | 183.3 MiB/s | 533.1 MiB | 00m03s >>> Running post-install scriptlet: libcublas-12-6-0:12.6.0.22-1.x86_64 >>> Stop post-install scriptlet: libcublas-12-6-0:12.6.0.22-1.x86_64 [ 30/234] Installing libcurand-12-6-0:1 100% | 308.9 MiB/s | 92.1 MiB | 00m00s >>> Running post-install scriptlet: libcurand-12-6-0:10.3.7.37-1.x86_64 >>> Stop post-install scriptlet: libcurand-12-6-0:10.3.7.37-1.x86_64 [ 31/234] Installing cpp-0:14.1.1-7.fc4 100% | 327.1 MiB/s | 35.0 MiB | 00m00s [ 32/234] Installing cuda-gcc-11-0:11.2 100% | 354.1 MiB/s | 95.9 MiB | 00m00s [ 33/234] Installing nss-softokn-freebl 100% | 255.8 MiB/s | 785.8 KiB | 00m00s [ 34/234] Installing nss-softokn-0:3.10 100% | 375.8 MiB/s | 1.9 MiB | 00m00s [ 35/234] Installing nss-sysinit-0:3.10 100% | 22.7 MiB/s | 23.3 KiB | 00m00s [ 36/234] Installing nss-0:3.103.0-1.fc 100% | 171.1 MiB/s | 1.9 MiB | 00m00s >>> Running post-install scriptlet: nss-0:3.103.0-1.fc41.x86_64 >>> Stop post-install scriptlet: nss-0:3.103.0-1.fc41.x86_64 [ 37/234] Installing graphviz-libs-0:12 100% | 244.2 MiB/s | 1.2 MiB | 00m00s [ 38/234] Installing urw-base35-bookman 100% | 105.0 MiB/s | 1.4 MiB | 00m00s >>> Running post-install scriptlet: urw-base35-bookman-fonts-0:20200910-23.fc41. >>> Stop post-install scriptlet: urw-base35-bookman-fonts-0:20200910-23.fc41.noa [ 39/234] Installing urw-base35-c059-fo 100% | 139.5 MiB/s | 1.4 MiB | 00m00s >>> Running post-install scriptlet: urw-base35-c059-fonts-0:20200910-23.fc41.noa >>> Stop post-install scriptlet: urw-base35-c059-fonts-0:20200910-23.fc41.noarch [ 40/234] Installing urw-base35-d050000 100% | 16.7 MiB/s | 85.4 KiB | 00m00s >>> Running post-install scriptlet: urw-base35-d050000l-fonts-0:20200910-23.fc41 >>> Stop post-install scriptlet: urw-base35-d050000l-fonts-0:20200910-23.fc41.no [ 41/234] Installing urw-base35-gothic- 100% | 145.4 MiB/s | 1.2 MiB | 00m00s >>> Running post-install scriptlet: urw-base35-gothic-fonts-0:20200910-23.fc41.n >>> Stop post-install scriptlet: urw-base35-gothic-fonts-0:20200910-23.fc41.noar [ 42/234] Installing urw-base35-nimbus- 100% | 150.3 MiB/s | 1.1 MiB | 00m00s >>> Running post-install scriptlet: urw-base35-nimbus-mono-ps-fonts-0:20200910-2 >>> Stop post-install scriptlet: urw-base35-nimbus-mono-ps-fonts-0:20200910-23.f [ 43/234] Installing urw-base35-nimbus- 100% | 151.8 MiB/s | 1.4 MiB | 00m00s >>> Running post-install scriptlet: urw-base35-nimbus-roman-fonts-0:20200910-23. >>> Stop post-install scriptlet: urw-base35-nimbus-roman-fonts-0:20200910-23.fc4 [ 44/234] Installing urw-base35-nimbus- 100% | 217.7 MiB/s | 2.4 MiB | 00m00s >>> Running post-install scriptlet: urw-base35-nimbus-sans-fonts-0:20200910-23.f >>> Stop post-install scriptlet: urw-base35-nimbus-sans-fonts-0:20200910-23.fc41 [ 45/234] Installing urw-base35-p052-fo 100% | 165.3 MiB/s | 1.5 MiB | 00m00s >>> Running post-install scriptlet: urw-base35-p052-fonts-0:20200910-23.fc41.noa >>> Stop post-install scriptlet: urw-base35-p052-fonts-0:20200910-23.fc41.noarch [ 46/234] Installing urw-base35-standar 100% | 12.9 MiB/s | 66.0 KiB | 00m00s >>> Running post-install scriptlet: urw-base35-standard-symbols-ps-fonts-0:20200 >>> Stop post-install scriptlet: urw-base35-standard-symbols-ps-fonts-0:20200910 [ 47/234] Installing urw-base35-z003-fo 100% | 63.8 MiB/s | 391.8 KiB | 00m00s >>> Running post-install scriptlet: urw-base35-z003-fonts-0:20200910-23.fc41.noa >>> Stop post-install scriptlet: urw-base35-z003-fonts-0:20200910-23.fc41.noarch [ 48/234] Installing urw-base35-fonts-0 100% | 5.5 MiB/s | 5.6 KiB | 00m00s [ 49/234] Installing google-droid-sans- 100% | 329.4 MiB/s | 6.3 MiB | 00m00s [ 50/234] Installing abattis-cantarell- 100% | 189.9 MiB/s | 194.4 KiB | 00m00s [ 51/234] Installing libgcrypt-0:1.11.0 100% | 385.5 MiB/s | 1.5 MiB | 00m00s [ 52/234] Installing libksba-0:1.6.7-2. 100% | 97.9 MiB/s | 401.0 KiB | 00m00s [ 53/234] Installing kernel-headers-0:6 100% | 204.5 MiB/s | 6.5 MiB | 00m00s [ 54/234] Installing glibc-headers-x86- 100% | 190.5 MiB/s | 2.3 MiB | 00m00s [ 55/234] Installing libxcrypt-devel-0: 100% | 31.8 MiB/s | 32.6 KiB | 00m00s [ 56/234] Installing glibc-devel-0:2.40 100% | 18.7 MiB/s | 38.4 KiB | 00m00s [ 57/234] Installing gcc-0:14.1.1-7.fc4 100% | 365.5 MiB/s | 104.2 MiB | 00m00s >>> Running trigger-install scriptlet: redhat-rpm-config-0:293-1.fc41.noarch >>> Stop trigger-install scriptlet: redhat-rpm-config-0:293-1.fc41.noarch [ 58/234] Installing libstdc++-devel-0: 100% | 353.4 MiB/s | 15.6 MiB | 00m00s [ 59/234] Installing gcc-c++-0:14.1.1-7 100% | 353.3 MiB/s | 38.2 MiB | 00m00s [ 60/234] Installing isl-0:0.16.1-21.fc 100% | 381.5 MiB/s | 3.1 MiB | 00m00s [ 61/234] Installing annobin-docs-0:12. 100% | 96.5 MiB/s | 98.8 KiB | 00m00s [ 62/234] Installing json-c-0:0.17-4.fc 100% | 4.8 MiB/s | 83.6 KiB | 00m00s >>> Running pre-install scriptlet: tpm2-tss-0:4.1.3-3.fc41.x86_64 >>> Stop pre-install scriptlet: tpm2-tss-0:4.1.3-3.fc41.x86_64 [ 63/234] Installing tpm2-tss-0:4.1.3-3 100% | 225.8 MiB/s | 1.6 MiB | 00m00s [ 64/234] Installing npth-0:1.7-2.fc41. 100% | 49.5 MiB/s | 50.7 KiB | 00m00s [ 65/234] Installing ncurses-0:6.5-2.20 100% | 206.3 MiB/s | 633.9 KiB | 00m00s [ 66/234] Installing dbus-libs-1:1.14.1 100% | 361.4 MiB/s | 370.0 KiB | 00m00s [ 67/234] Installing avahi-libs-0:0.8-2 100% | 164.9 MiB/s | 168.9 KiB | 00m00s [ 68/234] Installing libvmaf-0:3.0.0-2. 100% | 402.5 MiB/s | 824.4 KiB | 00m00s [ 69/234] Installing libaom-0:3.9.0-3.f 100% | 337.4 MiB/s | 5.1 MiB | 00m00s [ 70/234] Installing google-noto-fonts- 100% | 0.0 B/s | 18.3 KiB | 00m00s [ 71/234] Installing google-noto-sans-v 100% | 312.2 MiB/s | 1.2 MiB | 00m00s [ 72/234] Installing default-fonts-core 100% | 2.2 MiB/s | 18.2 KiB | 00m00s >>> Running pre-install scriptlet: groff-base-0:1.23.0-7.fc41.x86_64 >>> Stop pre-install scriptlet: groff-base-0:1.23.0-7.fc41.x86_64 [ 73/234] Installing groff-base-0:1.23. 100% | 193.7 MiB/s | 3.9 MiB | 00m00s >>> Running post-install scriptlet: groff-base-0:1.23.0-7.fc41.x86_64 >>> Stop post-install scriptlet: groff-base-0:1.23.0-7.fc41.x86_64 [ 74/234] Installing perl-Digest-0:1.20 100% | 36.2 MiB/s | 37.1 KiB | 00m00s [ 75/234] Installing perl-B-0:1.89-511. 100% | 244.8 MiB/s | 501.3 KiB | 00m00s [ 76/234] Installing perl-FileHandle-0: 100% | 0.0 B/s | 9.8 KiB | 00m00s [ 77/234] Installing perl-Digest-MD5-0: 100% | 60.2 MiB/s | 61.7 KiB | 00m00s [ 78/234] Installing perl-Data-Dumper-0 100% | 110.9 MiB/s | 113.6 KiB | 00m00s [ 79/234] Installing perl-libnet-0:3.15 100% | 143.9 MiB/s | 294.7 KiB | 00m00s [ 80/234] Installing perl-IO-Socket-IP- 100% | 98.1 MiB/s | 100.5 KiB | 00m00s [ 81/234] Installing perl-AutoLoader-0: 100% | 0.0 B/s | 20.9 KiB | 00m00s [ 82/234] Installing perl-URI-0:5.28-2. 100% | 82.1 MiB/s | 252.3 KiB | 00m00s [ 83/234] Installing perl-locale-0:1.12 100% | 0.0 B/s | 6.9 KiB | 00m00s [ 84/234] Installing perl-File-Path-0:2 100% | 63.0 MiB/s | 64.5 KiB | 00m00s [ 85/234] Installing perl-Mozilla-CA-0: 100% | 0.0 B/s | 10.8 KiB | 00m00s [ 86/234] Installing perl-Time-Local-2: 100% | 68.9 MiB/s | 70.6 KiB | 00m00s [ 87/234] Installing perl-Pod-Escapes-1 100% | 0.0 B/s | 25.9 KiB | 00m00s [ 88/234] Installing perl-Text-Tabs+Wra 100% | 0.0 B/s | 23.9 KiB | 00m00s [ 89/234] Installing perl-if-0:0.61.000 100% | 6.1 MiB/s | 6.2 KiB | 00m00s [ 90/234] Installing perl-Net-SSLeay-0: 100% | 227.1 MiB/s | 1.4 MiB | 00m00s [ 91/234] Installing perl-IO-Socket-SSL 100% | 229.6 MiB/s | 705.2 KiB | 00m00s [ 92/234] Installing perl-POSIX-0:2.20- 100% | 230.8 MiB/s | 236.4 KiB | 00m00s [ 93/234] Installing perl-Term-ANSIColo 100% | 96.9 MiB/s | 99.2 KiB | 00m00s [ 94/234] Installing perl-Term-Cap-0:1. 100% | 0.0 B/s | 30.6 KiB | 00m00s [ 95/234] Installing perl-IPC-Open3-0:1 100% | 0.0 B/s | 23.3 KiB | 00m00s [ 96/234] Installing perl-Class-Struct- 100% | 0.0 B/s | 25.9 KiB | 00m00s [ 97/234] Installing perl-File-Temp-1:0 100% | 160.2 MiB/s | 164.1 KiB | 00m00s [ 98/234] Installing perl-Pod-Simple-1: 100% | 278.5 MiB/s | 570.5 KiB | 00m00s [ 99/234] Installing perl-HTTP-Tiny-0:0 100% | 150.6 MiB/s | 154.2 KiB | 00m00s [100/234] Installing perl-Symbol-0:1.09 100% | 0.0 B/s | 7.2 KiB | 00m00s [101/234] Installing perl-SelectSaver-0 100% | 0.0 B/s | 2.6 KiB | 00m00s [102/234] Installing perl-Socket-4:2.03 100% | 123.1 MiB/s | 126.1 KiB | 00m00s [103/234] Installing perl-File-stat-0:1 100% | 0.0 B/s | 13.1 KiB | 00m00s [104/234] Installing perl-podlators-1:6 100% | 157.0 MiB/s | 321.4 KiB | 00m00s [105/234] Installing perl-Pod-Perldoc-0 100% | 165.3 MiB/s | 169.3 KiB | 00m00s [106/234] Installing perl-Text-ParseWor 100% | 0.0 B/s | 14.6 KiB | 00m00s [107/234] Installing perl-base-0:2.27-5 100% | 0.0 B/s | 12.9 KiB | 00m00s [108/234] Installing perl-Fcntl-0:1.18- 100% | 0.0 B/s | 50.1 KiB | 00m00s [109/234] Installing perl-mro-0:1.29-51 100% | 45.6 MiB/s | 46.7 KiB | 00m00s [110/234] Installing perl-overloading-0 100% | 0.0 B/s | 5.5 KiB | 00m00s [111/234] Installing perl-IO-0:1.55-511 100% | 151.7 MiB/s | 155.3 KiB | 00m00s [112/234] Installing perl-Pod-Usage-4:2 100% | 84.3 MiB/s | 86.3 KiB | 00m00s [113/234] Installing perl-File-Basename 100% | 0.0 B/s | 14.6 KiB | 00m00s [114/234] Installing perl-constant-0:1. 100% | 26.7 MiB/s | 27.4 KiB | 00m00s [115/234] Installing perl-Errno-0:1.38- 100% | 0.0 B/s | 8.8 KiB | 00m00s [116/234] Installing perl-Scalar-List-U 100% | 146.2 MiB/s | 149.7 KiB | 00m00s [117/234] Installing perl-vars-0:1.05-5 100% | 0.0 B/s | 4.3 KiB | 00m00s [118/234] Installing perl-overload-0:1. 100% | 0.0 B/s | 71.9 KiB | 00m00s [119/234] Installing perl-parent-1:0.24 100% | 0.0 B/s | 10.7 KiB | 00m00s [120/234] Installing perl-MIME-Base64-0 100% | 47.2 MiB/s | 48.4 KiB | 00m00s [121/234] Installing perl-Getopt-Std-0: 100% | 0.0 B/s | 11.7 KiB | 00m00s [122/234] Installing perl-Storable-1:3. 100% | 228.5 MiB/s | 234.0 KiB | 00m00s [123/234] Installing perl-Getopt-Long-1 100% | 143.8 MiB/s | 147.2 KiB | 00m00s [124/234] Installing perl-Carp-0:1.54-5 100% | 0.0 B/s | 47.7 KiB | 00m00s [125/234] Installing perl-Exporter-0:5. 100% | 54.3 MiB/s | 55.6 KiB | 00m00s [126/234] Installing perl-PathTools-0:3 100% | 180.2 MiB/s | 184.6 KiB | 00m00s [127/234] Installing perl-DynaLoader-0: 100% | 0.0 B/s | 32.5 KiB | 00m00s [128/234] Installing perl-Encode-4:3.21 100% | 314.6 MiB/s | 4.7 MiB | 00m00s [129/234] Installing perl-libs-4:5.40.0 100% | 262.1 MiB/s | 10.0 MiB | 00m00s [130/234] Installing perl-interpreter-4 100% | 121.1 MiB/s | 124.0 KiB | 00m00s [131/234] Installing perl-File-Find-0:1 100% | 0.0 B/s | 42.5 KiB | 00m00s [132/234] Installing perl-TermReadKey-0 100% | 64.7 MiB/s | 66.3 KiB | 00m00s [133/234] Installing perl-lib-0:0.65-51 100% | 0.0 B/s | 8.9 KiB | 00m00s [134/234] Installing perl-Error-1:0.170 100% | 78.6 MiB/s | 80.5 KiB | 00m00s [135/234] Installing libcbor-0:0.11.0-2 100% | 73.5 MiB/s | 75.3 KiB | 00m00s [136/234] Installing libfido2-0:1.15.0- 100% | 117.1 MiB/s | 239.7 KiB | 00m00s [137/234] Installing poppler-data-0:0.4 100% | 387.2 MiB/s | 12.4 MiB | 00m00s [138/234] Installing libdatrie-0:0.2.13 100% | 0.0 B/s | 59.0 KiB | 00m00s [139/234] Installing libthai-0:0.1.29-9 100% | 255.6 MiB/s | 785.3 KiB | 00m00s [140/234] Installing nettle-0:3.10-3.fc 100% | 388.7 MiB/s | 796.1 KiB | 00m00s [141/234] Installing gnutls-0:3.8.7-1.f 100% | 323.7 MiB/s | 3.2 MiB | 00m00s [142/234] Installing glib2-0:2.81.1-1.f 100% | 375.4 MiB/s | 14.6 MiB | 00m00s [143/234] Installing shared-mime-info-0 100% | 196.6 MiB/s | 2.6 MiB | 00m00s >>> Running post-install scriptlet: shared-mime-info-0:2.3-6.fc41.x86_64 >>> Stop post-install scriptlet: shared-mime-info-0:2.3-6.fc41.x86_64 [144/234] Installing gdk-pixbuf2-0:2.42 100% | 229.6 MiB/s | 2.5 MiB | 00m00s [145/234] Installing cups-libs-1:2.4.10 100% | 203.3 MiB/s | 624.4 KiB | 00m00s [146/234] Installing gnupg2-0:2.4.5-3.f 100% | 340.8 MiB/s | 9.5 MiB | 00m00s [147/234] Installing gpgme-0:1.23.2-5.f 100% | 284.0 MiB/s | 581.7 KiB | 00m00s [148/234] Installing gpgmepp-0:1.23.2-5 100% | 207.7 MiB/s | 425.4 KiB | 00m00s [149/234] Installing liblerc-0:4.0.0-7. 100% | 297.4 MiB/s | 609.0 KiB | 00m00s [150/234] Installing jbigkit-libs-0:2.1 100% | 116.8 MiB/s | 119.6 KiB | 00m00s [151/234] Installing libtiff-0:4.6.0-6. 100% | 297.0 MiB/s | 608.2 KiB | 00m00s [152/234] Installing svt-av1-libs-0:2.1 100% | 395.6 MiB/s | 7.1 MiB | 00m00s [153/234] Installing rav1e-libs-0:0.7.1 100% | 374.6 MiB/s | 3.0 MiB | 00m00s [154/234] Installing libdav1d-0:1.4.3-2 100% | 336.4 MiB/s | 1.7 MiB | 00m00s [155/234] Installing libavif-0:1.0.4-7. 100% | 180.8 MiB/s | 185.1 KiB | 00m00s [156/234] Installing libXau-0:1.0.11-7. 100% | 66.8 MiB/s | 68.4 KiB | 00m00s [157/234] Installing libxcb-0:1.17.0-2. 100% | 279.1 MiB/s | 1.1 MiB | 00m00s [158/234] Installing openssh-0:9.8p1-3. 100% | 444.6 MiB/s | 1.8 MiB | 00m00s [159/234] Installing libedit-0:3.1-53.2 100% | 240.0 MiB/s | 245.8 KiB | 00m00s [160/234] Installing openssh-clients-0: 100% | 173.3 MiB/s | 2.6 MiB | 00m00s >>> Running post-install scriptlet: openssh-clients-0:9.8p1-3.fc41.x86_64 >>> Stop post-install scriptlet: openssh-clients-0:9.8p1-3.fc41.x86_64 [161/234] Installing tzdata-0:2024a-9.f 100% | 64.7 MiB/s | 1.9 MiB | 00m00s [162/234] Installing python-pip-wheel-0 100% | 620.8 MiB/s | 1.2 MiB | 00m00s [163/234] Installing mpdecimal-0:2.5.1- 100% | 201.2 MiB/s | 206.0 KiB | 00m00s [164/234] Installing libb2-0:0.98.1-12. 100% | 8.5 MiB/s | 43.3 KiB | 00m00s [165/234] Installing python3-libs-0:3.1 100% | 294.8 MiB/s | 40.7 MiB | 00m00s [166/234] Installing python3-0:3.13.0~r 100% | 32.8 MiB/s | 33.6 KiB | 00m00s [167/234] Installing cmake-rpm-macros-0 100% | 0.0 B/s | 8.1 KiB | 00m00s [168/234] Installing python3-packaging- 100% | 211.5 MiB/s | 433.2 KiB | 00m00s [169/234] Installing python3-rpm-genera 100% | 0.0 B/s | 82.9 KiB | 00m00s [170/234] Installing fribidi-0:1.0.15-2 100% | 181.1 MiB/s | 370.9 KiB | 00m00s [171/234] Installing libpaper-1:2.1.1-7 100% | 49.3 MiB/s | 50.5 KiB | 00m00s [172/234] Installing libijs-0:0.35-23.f 100% | 0.0 B/s | 62.6 KiB | 00m00s [173/234] Installing jbig2dec-libs-0:0. 100% | 166.6 MiB/s | 170.6 KiB | 00m00s [174/234] Installing adobe-mappings-pdf 100% | 399.7 MiB/s | 4.4 MiB | 00m00s [175/234] Installing libX11-common-0:1. 100% | 148.4 MiB/s | 1.2 MiB | 00m00s [176/234] Installing libX11-0:1.8.10-1. 100% | 321.4 MiB/s | 1.3 MiB | 00m00s [177/234] Installing libXrender-0:0.9.1 100% | 0.0 B/s | 51.4 KiB | 00m00s [178/234] Installing libXext-0:1.3.6-2. 100% | 0.0 B/s | 91.3 KiB | 00m00s [179/234] Installing libXpm-0:3.5.17-4. 100% | 146.3 MiB/s | 149.8 KiB | 00m00s [180/234] Installing libXt-0:1.3.0-4.fc 100% | 210.5 MiB/s | 431.1 KiB | 00m00s [181/234] Installing graphite2-0:1.3.14 100% | 189.6 MiB/s | 194.1 KiB | 00m00s [182/234] Installing netpbm-0:11.02.00- 100% | 282.7 MiB/s | 579.0 KiB | 00m00s [183/234] Installing gts-0:0.7.6-49.201 100% | 214.0 MiB/s | 657.3 KiB | 00m00s [184/234] Installing libimagequant-0:4. 100% | 108.8 MiB/s | 668.3 KiB | 00m00s >>> Running pre-install scriptlet: xml-common-0:0.6.3-65.fc41.noarch >>> Stop pre-install scriptlet: xml-common-0:0.6.3-65.fc41.noarch [185/234] Installing xml-common-0:0.6.3 100% | 79.2 MiB/s | 81.1 KiB | 00m00s [186/234] Installing pixman-0:0.43.4-2. 100% | 351.2 MiB/s | 719.2 KiB | 00m00s [187/234] Installing cairo-0:1.18.0-4.f 100% | 348.7 MiB/s | 1.7 MiB | 00m00s [188/234] Installing harfbuzz-0:9.0.0-2 100% | 337.2 MiB/s | 2.7 MiB | 00m00s [189/234] Installing freetype-0:2.13.2- 100% | 277.4 MiB/s | 852.3 KiB | 00m00s [190/234] Installing fontconfig-0:2.15. 100% | 717.8 KiB/s | 811.1 KiB | 00m01s >>> Running post-install scriptlet: fontconfig-0:2.15.0-8.fc41.x86_64 >>> Stop post-install scriptlet: fontconfig-0:2.15.0-8.fc41.x86_64 [191/234] Installing cairo-gobject-0:1. 100% | 35.2 MiB/s | 36.1 KiB | 00m00s [192/234] Installing gd-0:2.3.3-17.fc41 100% | 131.8 MiB/s | 404.8 KiB | 00m00s [193/234] Installing libgs-0:10.03.1-3. 100% | 496.0 MiB/s | 23.3 MiB | 00m00s [194/234] Installing libXft-0:2.3.8-7.f 100% | 162.1 MiB/s | 166.0 KiB | 00m00s [195/234] Installing pango-0:1.54.0-2.f 100% | 244.6 MiB/s | 1.0 MiB | 00m00s [196/234] Installing librsvg2-0:2.57.1- 100% | 370.7 MiB/s | 4.1 MiB | 00m00s [197/234] Installing rsvg-pixbuf-loader 100% | 0.0 B/s | 16.5 KiB | 00m00s [198/234] Installing lasi-0:1.1.3-14.fc 100% | 129.2 MiB/s | 132.3 KiB | 00m00s [199/234] Installing poppler-0:24.02.0- 100% | 386.3 MiB/s | 3.5 MiB | 00m00s [200/234] Installing poppler-glib-0:24. 100% | 281.3 MiB/s | 576.1 KiB | 00m00s [201/234] Installing graphviz-0:12.0.0- 100% | 417.1 MiB/s | 20.0 MiB | 00m00s [202/234] Installing vim-filesystem-2:9 100% | 4.6 MiB/s | 4.7 KiB | 00m00s [203/234] Installing emacs-filesystem-1 100% | 0.0 B/s | 544.0 B | 00m00s [204/234] Installing less-0:661-2.fc41. 100% | 199.5 MiB/s | 408.6 KiB | 00m00s [205/234] Installing git-core-0:2.46.0- 100% | 452.3 MiB/s | 22.2 MiB | 00m00s [206/234] Installing git-core-doc-0:2.4 100% | 393.4 MiB/s | 17.3 MiB | 00m00s [207/234] Installing perl-Git-0:2.46.0- 100% | 0.0 B/s | 65.1 KiB | 00m00s [208/234] Installing git-0:2.46.0-1.fc4 100% | 85.4 MiB/s | 87.4 KiB | 00m00s [209/234] Installing rhash-0:1.4.4-2.fc 100% | 173.4 MiB/s | 355.1 KiB | 00m00s [210/234] Installing libuv-1:1.48.0-2.f 100% | 268.4 MiB/s | 549.6 KiB | 00m00s [211/234] Installing jsoncpp-0:1.9.5-8. 100% | 27.7 MiB/s | 254.9 KiB | 00m00s [212/234] Installing cmake-data-0:3.28. 100% | 134.6 MiB/s | 8.5 MiB | 00m00s [213/234] Installing cmake-0:3.28.3-7.f 100% | 381.3 MiB/s | 31.6 MiB | 00m00s [214/234] Installing xapian-core-libs-0 100% | 345.7 MiB/s | 2.1 MiB | 00m00s [215/234] Installing cuda-nvrtc-12-6-0: 100% | 247.7 MiB/s | 60.4 MiB | 00m00s >>> Running post-install scriptlet: cuda-nvrtc-12-6-0:12.6.20-1.x86_64 >>> Stop post-install scriptlet: cuda-nvrtc-12-6-0:12.6.20-1.x86_64 [216/234] Installing cuda-nvvm-12-6-0:1 100% | 224.1 MiB/s | 56.0 MiB | 00m00s [217/234] Installing cuda-crt-12-6-0:12 100% | 140.0 MiB/s | 860.2 KiB | 00m00s [218/234] Installing cuda-cccl-12-6-0:1 100% | 212.5 MiB/s | 11.9 MiB | 00m00s [219/234] Installing libcudnn8-0:8.9.7. 100% | 424.1 MiB/s | 1.0 GiB | 00m02s [220/234] Installing libcudnn8-devel-0: 100% | 28.0 MiB/s | 200.7 KiB | 00m00s >>> Running post-install scriptlet: libcudnn8-devel-0:8.9.7.29-2.cuda12.3.x86_64 >>> Stop post-install scriptlet: libcudnn8-devel-0:8.9.7.29-2.cuda12.3.x86_64 [221/234] Installing cuda-cudart-devel- 100% | 233.7 MiB/s | 6.8 MiB | 00m00s [222/234] Installing cuda-nvcc-12-6-0:1 100% | 312.8 MiB/s | 204.3 MiB | 00m01s [223/234] Installing cuda-nvrtc-devel-1 100% | 264.8 MiB/s | 93.0 MiB | 00m00s [224/234] Installing doxygen-2:1.12.0-1 100% | 384.3 MiB/s | 19.2 MiB | 00m00s [225/234] Installing python3-devel-0:3. 100% | 164.6 MiB/s | 1.8 MiB | 00m00s [226/234] Installing python3-setuptools 100% | 261.8 MiB/s | 7.3 MiB | 00m00s [227/234] Installing annobin-plugin-gcc 100% | 64.3 MiB/s | 987.3 KiB | 00m00s >>> Running trigger-install scriptlet: redhat-rpm-config-0:293-1.fc41.noarch >>> Stop trigger-install scriptlet: redhat-rpm-config-0:293-1.fc41.noarch [228/234] Installing cuda-gcc-11-c++-0: 100% | 367.3 MiB/s | 54.0 MiB | 00m00s [229/234] Installing gcc-plugin-annobin 100% | 4.1 MiB/s | 58.7 KiB | 00m00s >>> Running trigger-install scriptlet: redhat-rpm-config-0:293-1.fc41.noarch >>> Stop trigger-install scriptlet: redhat-rpm-config-0:293-1.fc41.noarch [230/234] Installing libcurand-devel-12 100% | 520.2 MiB/s | 2.1 MiB | 00m00s [231/234] Installing libcublas-devel-12 100% | 380.6 MiB/s | 1.1 MiB | 00m00s [232/234] Installing cuda-nvtx-12-6-0:1 100% | 201.4 MiB/s | 412.5 KiB | 00m00s [233/234] Installing cuda-nvml-devel-12 100% | 350.8 MiB/s | 1.4 MiB | 00m00s warning: posix.fork(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.wait(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead warning: posix.exec(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.execute() instead [234/234] Installing cuda-driver-devel- 100% | 285.8 KiB/s | 132.3 KiB | 00m00s >>> Running post-transaction scriptlet: cuda-toolkit-12-6-config-common-0:12.6.3 >>> Stop post-transaction scriptlet: cuda-toolkit-12-6-config-common-0:12.6.37-1 >>> Running post-transaction scriptlet: urw-base35-bookman-fonts-0:20200910-23.f >>> Stop post-transaction scriptlet: urw-base35-bookman-fonts-0:20200910-23.fc41 >>> Running post-transaction scriptlet: urw-base35-c059-fonts-0:20200910-23.fc41 >>> Stop post-transaction scriptlet: urw-base35-c059-fonts-0:20200910-23.fc41.no >>> Running post-transaction scriptlet: urw-base35-d050000l-fonts-0:20200910-23. >>> Stop post-transaction scriptlet: urw-base35-d050000l-fonts-0:20200910-23.fc4 >>> Running post-transaction scriptlet: urw-base35-gothic-fonts-0:20200910-23.fc >>> Stop post-transaction scriptlet: urw-base35-gothic-fonts-0:20200910-23.fc41. >>> Running post-transaction scriptlet: urw-base35-nimbus-mono-ps-fonts-0:202009 >>> Stop post-transaction scriptlet: urw-base35-nimbus-mono-ps-fonts-0:20200910- >>> Running post-transaction scriptlet: urw-base35-nimbus-roman-fonts-0:20200910 >>> Stop post-transaction scriptlet: urw-base35-nimbus-roman-fonts-0:20200910-23 >>> Running post-transaction scriptlet: urw-base35-nimbus-sans-fonts-0:20200910- >>> Stop post-transaction scriptlet: urw-base35-nimbus-sans-fonts-0:20200910-23. >>> Running post-transaction scriptlet: urw-base35-p052-fonts-0:20200910-23.fc41 >>> Stop post-transaction scriptlet: urw-base35-p052-fonts-0:20200910-23.fc41.no >>> Running post-transaction scriptlet: urw-base35-standard-symbols-ps-fonts-0:2 >>> Stop post-transaction scriptlet: urw-base35-standard-symbols-ps-fonts-0:2020 >>> Running post-transaction scriptlet: urw-base35-z003-fonts-0:20200910-23.fc41 >>> Stop post-transaction scriptlet: urw-base35-z003-fonts-0:20200910-23.fc41.no >>> Running post-transaction scriptlet: fontconfig-0:2.15.0-8.fc41.x86_64 >>> Stop post-transaction scriptlet: fontconfig-0:2.15.0-8.fc41.x86_64 >>> Running trigger-install scriptlet: glibc-common-0:2.40-3.fc41.x86_64 >>> Stop trigger-install scriptlet: glibc-common-0:2.40-3.fc41.x86_64 >>> Running trigger-install scriptlet: info-0:7.1-3.fc41.x86_64 >>> Stop trigger-install scriptlet: info-0:7.1-3.fc41.x86_64 >>> Running trigger-install scriptlet: glib2-0:2.81.1-1.fc41.x86_64 >>> Stop trigger-install scriptlet: glib2-0:2.81.1-1.fc41.x86_64 >>> Running trigger-install scriptlet: shared-mime-info-0:2.3-6.fc41.x86_64 >>> Stop trigger-install scriptlet: shared-mime-info-0:2.3-6.fc41.x86_64 >>> Running trigger-install scriptlet: gdk-pixbuf2-0:2.42.12-6.fc41.x86_64 >>> Stop trigger-install scriptlet: gdk-pixbuf2-0:2.42.12-6.fc41.x86_64 >>> Running trigger-install scriptlet: fontconfig-0:2.15.0-8.fc41.x86_64 >>> Stop trigger-install scriptlet: fontconfig-0:2.15.0-8.fc41.x86_64 >>> Running trigger-install scriptlet: graphviz-0:12.0.0-3.fc41.x86_64 >>> Stop trigger-install scriptlet: graphviz-0:12.0.0-3.fc41.x86_64 Complete! Warning: skipped PGP checks for 22 packages from repositories: copr_base, copr_rezso_CUDA, http_developer_download_nvidia_com_compute_cuda_repos_rhel9_x86_64 Finish: build setup for cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm Start: rpmbuild cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm Building target platforms: x86_64 Building for target x86_64 sh: -c: line 1: unexpected EOF while looking for matching `"' setting SOURCE_DATE_EPOCH=1636416000 Executing(%mkbuilddir): /bin/sh -e /var/tmp/rpm-tmp.X4aen5 + umask 022 + cd /builddir/build/BUILD/cutlass-3.5.0-build + test -d /builddir/build/BUILD/cutlass-3.5.0-build + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w /builddir/build/BUILD/cutlass-3.5.0-build + /usr/bin/rm -rf /builddir/build/BUILD/cutlass-3.5.0-build + /usr/bin/mkdir -p /builddir/build/BUILD/cutlass-3.5.0-build + /usr/bin/mkdir -p /builddir/build/BUILD/cutlass-3.5.0-build/SPECPARTS + RPM_EC=0 ++ jobs -p + exit 0 Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.tuVUme + umask 022 + cd /builddir/build/BUILD/cutlass-3.5.0-build + cd /builddir/build/BUILD/cutlass-3.5.0-build + rm -rf cutlass + /usr/bin/mkdir -p cutlass + cd cutlass + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w . + git clone --depth 1 -n -b v3.5.0 https://github.com/NVIDIA/cutlass.git . Cloning into '.'... + git reset --hard v3.5.0 HEAD is now at 7d49e6c Updates for CUTLASS 3.5.0 (#1468) + git log --format=fuller commit 7d49e6c7e2f8896c47f586706e67e1fb215529dc Author: Vijay Thakkar AuthorDate: Thu Apr 11 21:33:40 2024 -0400 Commit: GitHub CommitDate: Thu Apr 11 21:33:40 2024 -0400 Updates for CUTLASS 3.5.0 (#1468) Patch #0 (cutlass-fp16.patch): + echo 'Patch #0 (cutlass-fp16.patch):' + /usr/bin/patch --no-backup-if-mismatch -f -p0 -b --suffix .fp16~ --fuzz=100 patching file include/cutlass/functional.h Hunk #1 succeeded at 217 with fuzz 3 (offset 128 lines). + sed -i /-rpath/d CMakeLists.txt + RPM_EC=0 ++ jobs -p + exit 0 Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.MOUkz7 + umask 022 + cd /builddir/build/BUILD/cutlass-3.5.0-build + CFLAGS=' ' + export CFLAGS + CXXFLAGS=' ' + export CXXFLAGS + FFLAGS=' -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS=' -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=gcc + export CC + CXX=g++ + export CXX + cd cutlass + mkdir -p build + pushd build ~/build/BUILD/cutlass-3.5.0-build/cutlass/build ~/build/BUILD/cutlass-3.5.0-build/cutlass + export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64/ + LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64/ + CFLAGS=' ' + export CFLAGS + CXXFLAGS=' ' + export CXXFLAGS + FFLAGS=' -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS=' -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=gcc + export CC + CXX=g++ + export CXX + /usr/bin/cmake -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON .. -DCMAKE_SKIP_RPATH=ON -DCMAKE_VERBOSE_MAKEFILE=OFF -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXE_LINKER_FLAGS=/usr/lib64/libstdc++.so.6 -DBUILD_TESTING=OFF -DCUTLASS_ENABLE_F16C=ON -DCUTLASS_ENABLE_TESTS=OFF -DCUTLASS_ENABLE_PROFILER=ON -DCUTLASS_ENABLE_EXAMPLES=OFF -DCUDA_PROPAGATE_HOST_FLAGS=OFF -DCMAKE_CUDA_HOST_COMPILER=/usr/bin/cuda-c++ -DCUTLASS_NVCC_EMBED_PTX=ON -DCUTLASS_NVCC_EMBED_CUBIN=ON '-DCUTLASS_NVCC_ARCHS=52;61;75;86;89;90' '-DCMAKE_CUDA_FLAGS=-Wl,--no-relax -Xfatbin=-compress-all --compiler-options -fPIC -Wno-deprecated-gpu-targets -allow-unsupported-compiler -D_SERIALIZE_H_INCLUDED' -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.6/bin/nvcc -- CMake Version: 3.28.3 -- CUTLASS 3.5.0 -- The CXX compiler identification is GNU 14.1.1 -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/g++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- The CUDA compiler identification is NVIDIA 12.6.20 -- Detecting CUDA compiler ABI info -- Detecting CUDA compiler ABI info - done -- Check for working CUDA compiler: /usr/local/cuda-12.6/bin/nvcc - skipped -- Detecting CUDA compile features -- Detecting CUDA compile features - done -- CUDART: /usr/local/cuda-12.6/lib64/libcudart.so -- CUDA Driver: /usr/local/cuda-12.6/lib64/stubs/libcuda.so -- NVRTC: /usr/local/cuda-12.6/lib64/libnvrtc.so -- Default Install Location: /usr -- Found Python3: /usr/bin/python3.13 (found suitable version "3.13.0", minimum required is "3.5") found components: Interpreter CMake Warning at CMakeLists.txt:156 (message): Using unsupported or deprecated compute capabilities 52;61. Support may be removed in future versions. -- CUDA Compilation Architectures: 52;61;75;86;89;90 -- Enable caching of reference results in conv unit tests -- Enable rigorous conv problem sizes in conv unit tests -- Using NVCC flags: --expt-relaxed-constexpr;-DCUTLASS_TEST_LEVEL=0;-DCUTLASS_TEST_ENABLE_CACHED_RESULTS=1;-DCUTLASS_CONV_UNIT_TEST_RIGOROUS_SIZE_ENABLED=1;-DCUTLASS_DEBUG_TRACE_LEVEL=0;-Xcompiler=-mf16c;-Xcompiler=-Wconversion;-Xcompiler=-fno-strict-aliasing -- CUTLASS Revision: 7d49e6c -- Configuring cublas ... -- cuBLAS Disabled. -- Configuring cuBLAS ... done. -- Completed generation of library instances. See /builddir/build/BUILD/cutlass-3.5.0-build/cutlass/build/tools/library/library_instance_generation.log for more information. -- Configuring done (3.5s) -- Generating done (1.0s) CMake Warning: Manually-specified variables were not used by the project: CMAKE_C_FLAGS_RELEASE CMAKE_Fortran_FLAGS_RELEASE CMAKE_INSTALL_DO_STRIP CUDA_PROPAGATE_HOST_FLAGS INCLUDE_INSTALL_DIR LIB_INSTALL_DIR LIB_SUFFIX SHARE_INSTALL_PREFIX SYSCONF_INSTALL_DIR -- Build files have been written to: /builddir/build/BUILD/cutlass-3.5.0-build/cutlass/build + make -j4 [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/handle.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_cgemm_objs.dir/generated/gemm/50/cgemm/all_sm50_cgemm_gemm_operations.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_dgemm_objs.dir/generated/gemm/50/dgemm/all_sm50_dgemm_gemm_operations.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684symm_objs.dir/generated/symm/90/z1684symm/all_sm90_z1684symm_symm_operations.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684symm_objs.dir/generated/symm/90/z1684symm/cutlass_tensorop_z1684symm_128x64x8_1x1x1_3_n_ls_l_align1.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_cgemm_objs.dir/generated/gemm/50/cgemm/cutlass_simt_cgemm_128x64_8x2_nn_align1.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_dgemm_objs.dir/generated/gemm/50/dgemm/cutlass_simt_dgemm_128x128_8x2_nn_align1.cu.o [ 0%] Building CXX object tools/library/CMakeFiles/cutlass_library_objs.dir/src/manifest.cpp.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/operation_table.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/singleton.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684symm_objs.dir/generated/symm/90/z1684symm/cutlass_tensorop_z1684symm_128x64x8_1x1x1_3_n_ls_u_align1.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/util.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_dgemm_objs.dir/generated/gemm/50/dgemm/cutlass_simt_dgemm_128x128_8x2_nt_align1.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_cgemm_objs.dir/generated/gemm/50/cgemm/cutlass_simt_cgemm_128x64_8x2_nt_align1.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_int4.cu.o [ 0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684symm_objs.dir/generated/symm/90/z1684symm/cutlass_tensorop_z1684symm_128x64x8_1x1x1_3_n_rs_l_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_dgemm_objs.dir/generated/gemm/50/dgemm/cutlass_simt_dgemm_128x128_8x2_tn_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_cgemm_objs.dir/generated/gemm/50/cgemm/cutlass_simt_cgemm_128x64_8x2_tn_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684symm_objs.dir/generated/symm/90/z1684symm/cutlass_tensorop_z1684symm_128x64x8_1x1x1_3_n_rs_u_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_dgemm_objs.dir/generated/gemm/50/dgemm/cutlass_simt_dgemm_128x128_8x2_tt_align1.cu.o [ 1%] Built target cutlass_library_symm_sm90_z1684symm_objs [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_sgemm_objs.dir/generated/gemm/50/sgemm/all_sm50_sgemm_gemm_operations.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_cgemm_objs.dir/generated/gemm/50/cgemm/cutlass_simt_cgemm_128x64_8x2_tt_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_sgemm_objs.dir/generated/gemm/50/sgemm/cutlass_simt_sgemm_128x128_8x2_nn_align1.cu.o [ 1%] Built target cutlass_library_gemm_sm50_dgemm_objs [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm60_hgemm_objs.dir/generated/gemm/60/hgemm/all_sm60_hgemm_gemm_operations.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm60_hgemm_objs.dir/generated/gemm/60/hgemm/cutlass_simt_hgemm_256x128_8x2_nn_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_int8_canonical.cu.o [ 1%] Built target cutlass_library_gemm_sm50_cgemm_objs [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_igemm_s8_objs.dir/generated/gemm/61/igemm_s8/all_sm61_igemm_s8_gemm_operations.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_sgemm_objs.dir/generated/gemm/50/sgemm/cutlass_simt_sgemm_128x128_8x2_nt_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_igemm_s8_objs.dir/generated/gemm/61/igemm_s8/cutlass_simt_igemm_s8_128x128_32x2_nn_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm60_hgemm_objs.dir/generated/gemm/60/hgemm/cutlass_simt_hgemm_256x128_8x2_nt_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_sgemm_objs.dir/generated/gemm/50/sgemm/cutlass_simt_sgemm_128x128_8x2_tn_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_igemm_s8_objs.dir/generated/gemm/61/igemm_s8/cutlass_simt_igemm_s8_128x128_32x2_nt_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm50_sgemm_objs.dir/generated/gemm/50/sgemm/cutlass_simt_sgemm_128x128_8x2_tt_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm60_hgemm_objs.dir/generated/gemm/60/hgemm/cutlass_simt_hgemm_256x128_8x2_tn_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_igemm_s8_objs.dir/generated/gemm/61/igemm_s8/cutlass_simt_igemm_s8_128x128_32x2_tn_align1.cu.o [ 1%] Built target cutlass_library_gemm_sm50_sgemm_objs [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_s8_igemm_s8_objs.dir/generated/gemm/61/s8_igemm_s8/all_sm61_s8_igemm_s8_gemm_operations.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_igemm_s8_objs.dir/generated/gemm/61/igemm_s8/cutlass_simt_igemm_s8_128x128_32x2_tt_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_s8_igemm_s8_objs.dir/generated/gemm/61/s8_igemm_s8/cutlass_simt_s8_igemm_s8_128x128_32x2_nn_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm60_hgemm_objs.dir/generated/gemm/60/hgemm/cutlass_simt_hgemm_256x128_8x2_tt_align1.cu.o [ 1%] Built target cutlass_library_gemm_sm61_igemm_s8_objs [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_f16_objs.dir/generated/gemm/70/f16_s884gemm_f16/all_sm70_f16_s884gemm_f16_gemm_operations.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_s8_igemm_s8_objs.dir/generated/gemm/61/s8_igemm_s8/cutlass_simt_s8_igemm_s8_128x128_32x2_nt_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_f16_objs.dir/generated/gemm/70/f16_s884gemm_f16/cutlass_tensorop_f16_s884gemm_f16_256x128_32x2_nn_align8.cu.o [ 1%] Built target cutlass_library_gemm_sm60_hgemm_objs [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/all_sm70_f16_s884gemm_planar_complex_array_f16_gemm_operations.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_nn_align8.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_s8_igemm_s8_objs.dir/generated/gemm/61/s8_igemm_s8/cutlass_simt_s8_igemm_s8_128x128_32x2_tn_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_f16_objs.dir/generated/gemm/70/f16_s884gemm_f16/cutlass_tensorop_f16_s884gemm_f16_256x128_32x2_nt_align8.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_cn_align8.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm61_s8_igemm_s8_objs.dir/generated/gemm/61/s8_igemm_s8/cutlass_simt_s8_igemm_s8_128x128_32x2_tt_align1.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_f16_objs.dir/generated/gemm/70/f16_s884gemm_f16/cutlass_tensorop_f16_s884gemm_f16_256x128_32x2_tn_align8.cu.o [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_nc_align8.cu.o [ 1%] Built target cutlass_library_gemm_sm61_s8_igemm_s8_objs [ 1%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_cc_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_f16_objs.dir/generated/gemm/70/f16_s884gemm_f16/cutlass_tensorop_f16_s884gemm_f16_256x128_32x2_tt_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_nt_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_ct_align8.cu.o [ 2%] Built target cutlass_library_gemm_sm70_f16_s884gemm_f16_objs [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/all_sm70_f16_s884gemm_planar_complex_f16_gemm_operations.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_nn_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_nh_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_ch_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_cn_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_int8_interleaved_32.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_tn_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_hn_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_nc_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_tc_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_hc_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_cc_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_int8_interleaved_64.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_tt_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_ht_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_nt_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_th_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_array_f16/cutlass_tensorop_f16_s884gemm_planar_complex_array_f16_64x64_32x2_hh_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_ct_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_nh_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_ch_align8.cu.o [ 2%] Built target cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_objs [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_objs.dir/generated/gemm/70/h884gemm/all_sm70_h884gemm_gemm_operations.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_objs.dir/generated/gemm/70/h884gemm/cutlass_tensorop_h884gemm_256x128_32x2_nn_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_tn_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_hn_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_e4m3a_e4m3out.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_objs.dir/generated/gemm/70/h884gemm/cutlass_tensorop_h884gemm_256x128_32x2_nt_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_tc_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_hc_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_objs.dir/generated/gemm/70/h884gemm/cutlass_tensorop_h884gemm_256x128_32x2_tn_align8.cu.o [ 2%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_tt_align8.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_ht_align8.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_objs.dir/generated/gemm/70/h884gemm/cutlass_tensorop_h884gemm_256x128_32x2_tt_align8.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_th_align8.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/f16_s884gemm_planar_complex_f16/cutlass_tensorop_f16_s884gemm_planar_complex_f16_64x64_32x2_hh_align8.cu.o [ 3%] Built target cutlass_library_gemm_sm70_h884gemm_objs [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/all_sm70_h884gemm_planar_complex_gemm_operations.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_nn_align8.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_cn_align8.cu.o [ 3%] Built target cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_objs [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/all_sm70_h884gemm_planar_complex_array_gemm_operations.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_nn_align8.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_nc_align8.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_cc_align8.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_cn_align8.cu.o [ 3%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_nt_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_ct_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_nc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_nh_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_ch_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_tn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_cc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_hn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_tc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_nt_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_ct_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_hc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_nh_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_ch_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_tt_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_tn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_hn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_ht_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_tc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_hc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_th_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_tt_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_ht_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_objs.dir/generated/gemm/70/h884gemm_planar_complex/cutlass_tensorop_h884gemm_planar_complex_64x64_32x2_hh_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_th_align8.cu.o [ 4%] Built target cutlass_library_gemm_sm70_h884gemm_planar_complex_objs [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_f16_objs.dir/generated/gemm/70/s884gemm_f16/all_sm70_s884gemm_f16_gemm_operations.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs.dir/generated/gemm/70/h884gemm_planar_complex_array/cutlass_tensorop_h884gemm_planar_complex_array_64x64_32x2_hh_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_f16_objs.dir/generated/gemm/70/s884gemm_f16/cutlass_tensorop_s884gemm_f16_256x128_32x2_nn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_f16_objs.dir/generated/gemm/70/s884gemm_f16/cutlass_tensorop_s884gemm_f16_256x128_32x2_nt_align8.cu.o [ 4%] Built target cutlass_library_gemm_sm70_h884gemm_planar_complex_array_objs [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/all_sm70_s884gemm_planar_complex_array_f16_gemm_operations.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_f16_objs.dir/generated/gemm/70/s884gemm_f16/cutlass_tensorop_s884gemm_f16_256x128_32x2_tn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_nn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_f16_objs.dir/generated/gemm/70/s884gemm_f16/cutlass_tensorop_s884gemm_f16_256x128_32x2_tt_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_cn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_nc_align8.cu.o [ 4%] Built target cutlass_library_gemm_sm70_s884gemm_f16_objs [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/all_sm70_s884gemm_planar_complex_f16_gemm_operations.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_nn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_cc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_nt_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_cn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_ct_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_nh_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_nc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_ch_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_tn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_cc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_hn_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_tc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_nt_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_hc_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_tt_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_ct_align8.cu.o [ 4%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_ht_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_th_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_nh_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_array_f16/cutlass_tensorop_s884gemm_planar_complex_array_f16_64x64_32x2_hh_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_ch_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_tn_align8.cu.o [ 5%] Built target cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_objs [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_f16_objs.dir/generated/gemm/75/f16_s1688gemm_f16/all_sm75_f16_s1688gemm_f16_gemm_operations.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_hn_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_f16_objs.dir/generated/gemm/75/f16_s1688gemm_f16/cutlass_tensorop_f16_s1688gemm_f16_256x128_32x2_nn_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_tc_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_hc_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_f16_objs.dir/generated/gemm/75/f16_s1688gemm_f16/cutlass_tensorop_f16_s1688gemm_f16_256x128_32x2_nt_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_tt_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_ht_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_f16_objs.dir/generated/gemm/75/f16_s1688gemm_f16/cutlass_tensorop_f16_s1688gemm_f16_256x128_32x2_tn_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_th_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs.dir/generated/gemm/70/s884gemm_planar_complex_f16/cutlass_tensorop_s884gemm_planar_complex_f16_64x64_32x2_hh_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_f16_objs.dir/generated/gemm/75/f16_s1688gemm_f16/cutlass_tensorop_f16_s1688gemm_f16_256x128_32x2_tt_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/all_sm75_f16_s1688gemm_planar_complex_array_f16_gemm_operations.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_nn_align8.cu.o [ 5%] Built target cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_objs [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/all_sm75_f16_s1688gemm_planar_complex_f16_gemm_operations.cu.o [ 5%] Built target cutlass_library_gemm_sm75_f16_s1688gemm_f16_objs [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_objs.dir/generated/gemm/75/h1688gemm/all_sm75_h1688gemm_gemm_operations.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_nn_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_objs.dir/generated/gemm/75/h1688gemm/cutlass_tensorop_h1688gemm_256x128_32x2_nn_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_cn_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_cn_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_objs.dir/generated/gemm/75/h1688gemm/cutlass_tensorop_h1688gemm_256x128_32x2_nt_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_nc_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_nc_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_objs.dir/generated/gemm/75/h1688gemm/cutlass_tensorop_h1688gemm_256x128_32x2_tn_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_cc_align8.cu.o [ 5%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_cc_align8.cu.o [ 6%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_objs.dir/generated/gemm/75/h1688gemm/cutlass_tensorop_h1688gemm_256x128_32x2_tt_align8.cu.o [ 6%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_nt_align8.cu.o [ 6%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_nt_align8.cu.o [ 6%] Built target cutlass_library_gemm_sm75_h1688gemm_objs [ 6%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/all_sm75_h1688gemm_planar_complex_gemm_operations.cu.o [ 6%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_nn_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_ct_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_ct_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_cn_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_e5m2a_e4m3out.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_nh_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_nh_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_nc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_ch_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_ch_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_cc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_tn_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_tn_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_nt_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_hn_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_hn_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_ct_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_tc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_tc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_nh_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_hc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_hc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_tt_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_ch_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_tt_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_ht_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_tn_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_ht_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_th_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_hn_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_th_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_f16_64x128_32x2_hh_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_tc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/f16_s1688gemm_planar_complex_array_f16/cutlass_tensorop_f16_s1688gemm_planar_complex_array_f16_64x128_32x2_hh_align8.cu.o [ 7%] Built target cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_objs [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/all_sm75_h1688gemm_planar_complex_array_gemm_operations.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_hc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_nn_align8.cu.o [ 7%] Built target cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_objs [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i88128xorgemm_b1_objs.dir/generated/gemm/75/i88128xorgemm_b1/all_sm75_i88128xorgemm_b1_gemm_operations.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_tt_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i88128xorgemm_b1_objs.dir/generated/gemm/75/i88128xorgemm_b1/cutlass_tensorop_i88128xorgemm_b1_256x128_512x2_tn_align128.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_cn_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_ht_align8.cu.o [ 7%] Built target cutlass_library_gemm_sm75_i88128xorgemm_b1_objs [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i8816gemm_s8_objs.dir/generated/gemm/75/i8816gemm_s8/all_sm75_i8816gemm_s8_gemm_operations.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i8816gemm_s8_objs.dir/generated/gemm/75/i8816gemm_s8/cutlass_tensorop_i8816gemm_s8_256x128_64x2_tn_align16.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_nc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_th_align8.cu.o [ 7%] Built target cutlass_library_gemm_sm75_i8816gemm_s8_objs [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i8816gemm_u8_objs.dir/generated/gemm/75/i8816gemm_u8/all_sm75_i8816gemm_u8_gemm_operations.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_cc_align8.cu.o [ 7%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i8816gemm_u8_objs.dir/generated/gemm/75/i8816gemm_u8/cutlass_tensorop_i8816gemm_u8_256x128_64x2_tn_align16.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs.dir/generated/gemm/75/h1688gemm_planar_complex/cutlass_tensorop_h1688gemm_planar_complex_64x128_32x2_hh_align8.cu.o [ 8%] Built target cutlass_library_gemm_sm75_i8816gemm_u8_objs [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i8832gemm_s4_objs.dir/generated/gemm/75/i8832gemm_s4/all_sm75_i8832gemm_s4_gemm_operations.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_nt_align8.cu.o [ 8%] Built target cutlass_library_gemm_sm75_h1688gemm_planar_complex_objs [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i8832gemm_s4_objs.dir/generated/gemm/75/i8832gemm_s4/cutlass_tensorop_i8832gemm_s4_256x128_128x2_tn_align32.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i8832gemm_u4_objs.dir/generated/gemm/75/i8832gemm_u4/all_sm75_i8832gemm_u4_gemm_operations.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_i8832gemm_u4_objs.dir/generated/gemm/75/i8832gemm_u4/cutlass_tensorop_i8832gemm_u4_256x128_128x2_tn_align32.cu.o [ 8%] Built target cutlass_library_gemm_sm75_i8832gemm_s4_objs [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_f16_objs.dir/generated/gemm/75/s1688gemm_f16/all_sm75_s1688gemm_f16_gemm_operations.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_ct_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_f16_objs.dir/generated/gemm/75/s1688gemm_f16/cutlass_tensorop_s1688gemm_f16_256x128_32x2_nn_align8.cu.o [ 8%] Built target cutlass_library_gemm_sm75_i8832gemm_u4_objs [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/all_sm75_s1688gemm_planar_complex_array_f16_gemm_operations.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_nn_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_nh_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_f16_objs.dir/generated/gemm/75/s1688gemm_f16/cutlass_tensorop_s1688gemm_f16_256x128_32x2_nt_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_cn_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_f16_objs.dir/generated/gemm/75/s1688gemm_f16/cutlass_tensorop_s1688gemm_f16_256x128_32x2_tn_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_ch_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_nc_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_f16_objs.dir/generated/gemm/75/s1688gemm_f16/cutlass_tensorop_s1688gemm_f16_256x128_32x2_tt_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_tn_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_cc_align8.cu.o [ 8%] Built target cutlass_library_gemm_sm75_s1688gemm_f16_objs [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/all_sm75_s1688gemm_planar_complex_f16_gemm_operations.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_nn_align8.cu.o [ 8%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_hn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_nt_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_cn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_ct_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_tc_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_nc_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_nh_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_hc_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_cc_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_ch_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_nt_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_tt_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_tn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_ct_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_ht_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_hn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_nh_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_th_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_tc_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_ch_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_hc_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_tn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs.dir/generated/gemm/75/h1688gemm_planar_complex_array/cutlass_tensorop_h1688gemm_planar_complex_array_64x128_32x2_hh_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_e4m3a_e5m2out.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_tt_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_hn_align8.cu.o [ 9%] Built target cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s4_i8832gemm_s4_objs.dir/generated/gemm/75/s4_i8832gemm_s4/all_sm75_s4_i8832gemm_s4_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s4_i8832gemm_s4_objs.dir/generated/gemm/75/s4_i8832gemm_s4/cutlass_tensorop_s4_i8832gemm_s4_256x128_128x2_tn_align32.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_tc_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_ht_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s4_i8832gemm_s4_objs.dir/generated/gemm/75/s4_i8832gemm_s4/cutlass_tensorop_s4_i8832gemm_s4_256x128_128x2_n64t64_align32.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_hc_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_th_align8.cu.o [ 9%] Built target cutlass_library_gemm_sm75_s4_i8832gemm_s4_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s8_i8816gemm_s8_objs.dir/generated/gemm/75/s8_i8816gemm_s8/all_sm75_s8_i8816gemm_s8_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_tt_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_array_f16/cutlass_tensorop_s1688gemm_planar_complex_array_f16_64x128_32x2_hh_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s8_i8816gemm_s8_objs.dir/generated/gemm/75/s8_i8816gemm_s8/cutlass_tensorop_s8_i8816gemm_s8_256x128_64x2_tn_align16.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_ht_align8.cu.o [ 9%] Built target cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_u4_i8832gemm_u4_objs.dir/generated/gemm/75/u4_i8832gemm_u4/all_sm75_u4_i8832gemm_u4_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s8_i8816gemm_s8_objs.dir/generated/gemm/75/s8_i8816gemm_s8/cutlass_tensorop_s8_i8816gemm_s8_256x128_64x2_n32t32_align16.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_u4_i8832gemm_u4_objs.dir/generated/gemm/75/u4_i8832gemm_u4/cutlass_tensorop_u4_i8832gemm_u4_256x128_128x2_tn_align32.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_th_align8.cu.o [ 9%] Built target cutlass_library_gemm_sm75_s8_i8816gemm_s8_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_u8_i8816gemm_u8_objs.dir/generated/gemm/75/u8_i8816gemm_u8/all_sm75_u8_i8816gemm_u8_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_u4_i8832gemm_u4_objs.dir/generated/gemm/75/u4_i8832gemm_u4/cutlass_tensorop_u4_i8832gemm_u4_256x128_128x2_n64t64_align32.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_u8_i8816gemm_u8_objs.dir/generated/gemm/75/u8_i8816gemm_u8/cutlass_tensorop_u8_i8816gemm_u8_256x128_64x2_tn_align16.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs.dir/generated/gemm/75/s1688gemm_planar_complex_f16/cutlass_tensorop_s1688gemm_planar_complex_f16_64x128_32x2_hh_align8.cu.o [ 9%] Built target cutlass_library_gemm_sm75_u4_i8832gemm_u4_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_bf16/all_sm80_bf16_s16816gemm_bf16_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm75_u8_i8816gemm_u8_objs.dir/generated/gemm/75/u8_i8816gemm_u8/cutlass_tensorop_u8_i8816gemm_u8_256x128_64x2_n32t32_align16.cu.o [ 9%] Built target cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_s8_objs.dir/generated/gemm/80/bf16_s16816gemm_bf16_s8/all_sm80_bf16_s16816gemm_bf16_s8_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_bf16/cutlass_tensorop_bf16_s16816gemm_bf16_256x128_32x3_nn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_s8_objs.dir/generated/gemm/80/bf16_s16816gemm_bf16_s8/cutlass_tensorop_bf16_s16816gemm_bf16_s8_128x128_64x4_tn_align16.cu.o [ 9%] Built target cutlass_library_gemm_sm75_u8_i8816gemm_u8_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_u8_objs.dir/generated/gemm/80/bf16_s16816gemm_bf16_u8/all_sm80_bf16_s16816gemm_bf16_u8_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_bf16/cutlass_tensorop_bf16_s16816gemm_bf16_256x128_32x3_nt_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_u8_objs.dir/generated/gemm/80/bf16_s16816gemm_bf16_u8/cutlass_tensorop_bf16_s16816gemm_bf16_u8_128x128_64x4_tn_align16.cu.o [ 9%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_s8_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/all_sm80_bf16_s16816gemm_planar_complex_array_bf16_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_nn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_bf16/cutlass_tensorop_bf16_s16816gemm_bf16_256x128_32x3_tn_align8.cu.o [ 9%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_u8_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/all_sm80_bf16_s16816gemm_planar_complex_bf16_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_nn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_cn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_bf16/cutlass_tensorop_bf16_s16816gemm_bf16_256x128_32x3_tt_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_cn_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_nc_align8.cu.o [ 9%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_objs [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_s8_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_s8_bf16/all_sm80_bf16_s16816gemm_s8_bf16_gemm_operations.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_nc_align8.cu.o [ 9%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_s8_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_s8_bf16/cutlass_tensorop_bf16_s16816gemm_s8_bf16_128x128_64x4_tn_align16.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_cc_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_cc_align8.cu.o [ 10%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_s8_bf16_objs [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_u8_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_u8_bf16/all_sm80_bf16_s16816gemm_u8_bf16_gemm_operations.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_nt_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_u8_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_u8_bf16/cutlass_tensorop_bf16_s16816gemm_u8_bf16_128x128_64x4_tn_align16.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_nt_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_ct_align8.cu.o [ 10%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_u8_bf16_objs [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16832spgemm_bf16_objs.dir/generated/gemm/80/bf16_s16832spgemm_bf16/all_sm80_bf16_s16832spgemm_bf16_gemm_operations.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_ct_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16832spgemm_bf16_objs.dir/generated/gemm/80/bf16_s16832spgemm_bf16/cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_nh_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_nh_align8.cu.o ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002e51_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16832spgemm_bf16_objs.dir/generated/gemm/80/bf16_s16832spgemm_bf16/cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_ch_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_ch_align8.cu.o ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002ebe_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16832spgemm_bf16_objs.dir/generated/gemm/80/bf16_s16832spgemm_bf16/cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_tn_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_tn_align8.cu.o ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f2b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16832spgemm_bf16_objs.dir/generated/gemm/80/bf16_s16832spgemm_bf16/cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_hn_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_hn_align8.cu.o ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-8_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-7_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00002f9b_00000000-6_cutlass_tensorop_bf16_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_tc_align8.cu.o [ 10%] Built target cutlass_library_gemm_sm80_bf16_s16832spgemm_bf16_objs [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/all_sm80_c1688gemm_gemm_operations.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_tc_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_nn_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_hc_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_hc_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_cn_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_tt_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_tt_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_nc_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_ht_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_ht_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_cc_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_th_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_th_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_nt_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_bf16_64x128_32x3_hh_align8.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/bf16_s16816gemm_planar_complex_array_bf16/cutlass_tensorop_bf16_s16816gemm_planar_complex_array_bf16_64x128_32x3_hh_align8.cu.o [ 10%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_objs [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/all_sm80_c1688tf32gemm_gemm_operations.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_ct_align1.cu.o [ 10%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_objs [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/all_sm80_cgemm_gemm_operations.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_nn_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_nn_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_nh_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_cn_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_cn_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_ch_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_nc_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_nc_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_tn_align1.cu.o [ 10%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_e5m2a_e5m2out.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_cc_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_cc_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_hn_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_nt_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_tc_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_nt_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_ct_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_hc_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_nh_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_ct_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_tt_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_ch_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_nh_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_ht_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_tn_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_ch_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_th_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_hn_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_tn_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688gemm_objs.dir/generated/gemm/80/c1688gemm/cutlass_tensorop_c1688gemm_128x64_16x3_hh_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_tc_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_hn_align1.cu.o [ 11%] Built target cutlass_library_gemm_sm80_c1688gemm_objs [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_d884gemm_objs.dir/generated/gemm/80/d884gemm/all_sm80_d884gemm_gemm_operations.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_hc_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_d884gemm_objs.dir/generated/gemm/80/d884gemm/cutlass_tensorop_d884gemm_128x128_16x3_nn_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_tc_align1.cu.o [ 11%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_tt_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_d884gemm_objs.dir/generated/gemm/80/d884gemm/cutlass_tensorop_d884gemm_128x128_16x3_nt_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_ht_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_d884gemm_objs.dir/generated/gemm/80/d884gemm/cutlass_tensorop_d884gemm_128x128_16x3_tn_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_hc_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_d884gemm_objs.dir/generated/gemm/80/d884gemm/cutlass_tensorop_d884gemm_128x128_16x3_tt_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_th_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_tt_align1.cu.o [ 12%] Built target cutlass_library_gemm_sm80_d884gemm_objs [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_dgemm_objs.dir/generated/gemm/80/dgemm/all_sm80_dgemm_gemm_operations.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_c1688tf32gemm_objs.dir/generated/gemm/80/c1688tf32gemm/cutlass_tensorop_c1688tf32gemm_128x128_16x4_hh_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_dgemm_objs.dir/generated/gemm/80/dgemm/cutlass_simt_dgemm_128x128_8x3_nn_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_ht_align1.cu.o [ 12%] Built target cutlass_library_gemm_sm80_c1688tf32gemm_objs [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_f16_objs.dir/generated/gemm/80/f16_s16816gemm_f16/all_sm80_f16_s16816gemm_f16_gemm_operations.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_dgemm_objs.dir/generated/gemm/80/dgemm/cutlass_simt_dgemm_128x128_8x3_nt_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_f16_objs.dir/generated/gemm/80/f16_s16816gemm_f16/cutlass_tensorop_f16_s16816gemm_f16_256x128_32x3_nn_align8.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_th_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_dgemm_objs.dir/generated/gemm/80/dgemm/cutlass_simt_dgemm_128x128_8x3_tn_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_f16_objs.dir/generated/gemm/80/f16_s16816gemm_f16/cutlass_tensorop_f16_s16816gemm_f16_256x128_32x3_nt_align8.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_cgemm_objs.dir/generated/gemm/80/cgemm/cutlass_simt_cgemm_128x128_8x5_hh_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_dgemm_objs.dir/generated/gemm/80/dgemm/cutlass_simt_dgemm_128x128_8x3_tt_align1.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_f16_objs.dir/generated/gemm/80/f16_s16816gemm_f16/cutlass_tensorop_f16_s16816gemm_f16_256x128_32x3_tn_align8.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_f16_objs.dir/generated/gemm/80/f16_s16816gemm_f16/cutlass_tensorop_f16_s16816gemm_f16_256x128_32x3_tt_align8.cu.o [ 12%] Built target cutlass_library_gemm_sm80_cgemm_objs [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_f16_s8_objs.dir/generated/gemm/80/f16_s16816gemm_f16_s8/all_sm80_f16_s16816gemm_f16_s8_gemm_operations.cu.o [ 12%] Built target cutlass_library_gemm_sm80_dgemm_objs [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_f16_u8_objs.dir/generated/gemm/80/f16_s16816gemm_f16_u8/all_sm80_f16_s16816gemm_f16_u8_gemm_operations.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_f16_s8_objs.dir/generated/gemm/80/f16_s16816gemm_f16_s8/cutlass_tensorop_f16_s16816gemm_f16_s8_128x128_64x4_tn_align16.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_f16_u8_objs.dir/generated/gemm/80/f16_s16816gemm_f16_u8/cutlass_tensorop_f16_s16816gemm_f16_u8_128x128_64x4_tn_align16.cu.o [ 12%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_f16_objs [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/all_sm80_f16_s16816gemm_planar_complex_array_f16_gemm_operations.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_nn_align8.cu.o [ 12%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_f16_s8_objs [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/all_sm80_f16_s16816gemm_planar_complex_f16_gemm_operations.cu.o [ 12%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_f16_u8_objs [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_s8_f16_objs.dir/generated/gemm/80/f16_s16816gemm_s8_f16/all_sm80_f16_s16816gemm_s8_f16_gemm_operations.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_nn_align8.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_s8_f16_objs.dir/generated/gemm/80/f16_s16816gemm_s8_f16/cutlass_tensorop_f16_s16816gemm_s8_f16_128x128_64x4_tn_align16.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_cn_align8.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_cn_align8.cu.o [ 12%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_s8_f16_objs [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_u8_f16_objs.dir/generated/gemm/80/f16_s16816gemm_u8_f16/all_sm80_f16_s16816gemm_u8_f16_gemm_operations.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_nc_align8.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_u8_f16_objs.dir/generated/gemm/80/f16_s16816gemm_u8_f16/cutlass_tensorop_f16_s16816gemm_u8_f16_128x128_64x4_tn_align16.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_nc_align8.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_cc_align8.cu.o [ 12%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_cc_align8.cu.o [ 12%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_u8_f16_objs [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16832spgemm_f16_objs.dir/generated/gemm/80/f16_s16832spgemm_f16/all_sm80_f16_s16832spgemm_f16_gemm_operations.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16832spgemm_f16_objs.dir/generated/gemm/80/f16_s16832spgemm_f16/cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_nt_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_nt_align8.cu.o ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003d91_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16832spgemm_f16_objs.dir/generated/gemm/80/f16_s16832spgemm_f16/cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_ct_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_ct_align8.cu.o ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003dff_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16832spgemm_f16_objs.dir/generated/gemm/80/f16_s16832spgemm_f16/cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_nh_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_nh_align8.cu.o ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003e6c_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16832spgemm_f16_objs.dir/generated/gemm/80/f16_s16832spgemm_f16/cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_ch_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_ch_align8.cu.o ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-8_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-7_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00003ed8_00000000-6_cutlass_tensorop_f16_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 13%] Built target cutlass_library_gemm_sm80_f16_s16832spgemm_f16_objs [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/all_sm80_gz884gemm_gemm_operations.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_tn_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_tn_align8.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_nn_align1.cu.o [ 13%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_hn_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_hn_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_cn_align1.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_fp8in_fp16out.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_tc_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_nc_align1.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_tc_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_hc_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_cc_align1.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_hc_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_tt_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_nt_align1.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_tt_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_ct_align1.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_ht_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_ht_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_nh_align1.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_th_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_th_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_ch_align1.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_f16_64x128_32x3_hh_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/f16_s16816gemm_planar_complex_array_f16/cutlass_tensorop_f16_s16816gemm_planar_complex_array_f16_64x128_32x3_hh_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_tn_align1.cu.o [ 14%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_objs [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_objs.dir/generated/gemm/80/h16816gemm/all_sm80_h16816gemm_gemm_operations.cu.o [ 14%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_objs [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_grouped_objs.dir/generated/gemm/80/h16816gemm_grouped/all_sm80_h16816gemm_grouped_gemm_operations.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_objs.dir/generated/gemm/80/h16816gemm/cutlass_tensorop_h16816gemm_256x128_32x3_nn_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_hn_align1.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_grouped_objs.dir/generated/gemm/80/h16816gemm_grouped/cutlass_tensorop_h16816gemm_grouped_256x128_32x3_nn_align8_scheduleDevice.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_objs.dir/generated/gemm/80/h16816gemm/cutlass_tensorop_h16816gemm_256x128_32x3_nt_align8.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_tc_align1.cu.o [ 14%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_grouped_objs.dir/generated/gemm/80/h16816gemm_grouped/cutlass_tensorop_h16816gemm_grouped_256x128_32x3_nt_align8_scheduleDevice.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_objs.dir/generated/gemm/80/h16816gemm/cutlass_tensorop_h16816gemm_256x128_32x3_tn_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_hc_align1.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_grouped_objs.dir/generated/gemm/80/h16816gemm_grouped/cutlass_tensorop_h16816gemm_grouped_256x128_32x3_tn_align8_scheduleDevice.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_tt_align1.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_objs.dir/generated/gemm/80/h16816gemm/cutlass_tensorop_h16816gemm_256x128_32x3_tt_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_grouped_objs.dir/generated/gemm/80/h16816gemm_grouped/cutlass_tensorop_h16816gemm_grouped_256x128_32x3_tt_align8_scheduleDevice.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_ht_align1.cu.o [ 15%] Built target cutlass_library_gemm_sm80_h16816gemm_objs [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/all_sm80_h16816gemm_planar_complex_gemm_operations.cu.o [ 15%] Built target cutlass_library_gemm_sm80_h16816gemm_grouped_objs [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/all_sm80_h16816gemm_planar_complex_array_gemm_operations.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_nn_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_th_align1.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_nn_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_cn_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_gz884gemm_objs.dir/generated/gemm/80/gz884gemm/cutlass_tensorop_gz884gemm_64x64_8x3_hh_align1.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_cn_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_fp8in_bf16out.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_nc_align8.cu.o [ 15%] Built target cutlass_library_gemm_sm80_gz884gemm_objs [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_s8_f16_objs.dir/generated/gemm/80/h16816gemm_s8_f16/all_sm80_h16816gemm_s8_f16_gemm_operations.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_s8_f16_objs.dir/generated/gemm/80/h16816gemm_s8_f16/cutlass_tensorop_h16816gemm_s8_f16_128x128_64x4_tn_align16.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_nc_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_cc_align8.cu.o [ 15%] Built target cutlass_library_gemm_sm80_h16816gemm_s8_f16_objs [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16832spgemm_objs.dir/generated/gemm/80/h16832spgemm/all_sm80_h16832spgemm_gemm_operations.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_cc_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16832spgemm_objs.dir/generated/gemm/80/h16832spgemm/cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_nt_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_nt_align8.cu.o ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1127; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1131; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1135; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1139; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1143; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1147; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1151; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1155; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1159; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1163; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1167; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1171; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1175; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1179; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1183; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1187; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1421; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1425; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1429; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1433; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1437; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1441; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1445; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1449; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1453; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1457; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1461; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1465; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1469; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1473; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1477; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_86.ptx, line 1481; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1127; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1131; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1135; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1139; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1143; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1147; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1151; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1155; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1159; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1163; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1167; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1171; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1175; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1179; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1183; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1187; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1421; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1425; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1429; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1433; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1437; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1441; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1445; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1449; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1453; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1457; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1461; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1465; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1469; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1473; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1477; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_89.ptx, line 1481; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1127; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1131; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1135; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1139; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1143; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1147; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1151; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1155; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1159; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1163; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1167; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1171; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1175; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1179; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1183; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1187; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1421; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1425; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1429; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1433; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1437; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1441; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1445; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1449; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1453; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1457; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1461; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1465; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1469; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1473; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1477; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000473b_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nn_align8.compute_90.ptx, line 1481; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16832spgemm_objs.dir/generated/gemm/80/h16832spgemm/cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_ct_align8.cu.o ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1209; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1213; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1217; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1221; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1225; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1229; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1233; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1237; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1241; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1245; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1249; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1501; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1505; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1509; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1513; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1517; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_86.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_ct_align8.cu.o ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1209; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1213; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1217; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1221; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1225; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1229; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1233; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1237; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1241; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1245; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1249; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1501; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1505; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1509; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1513; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1517; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_89.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1209; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1213; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1217; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1221; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1225; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1229; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1233; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1237; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1241; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1245; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1249; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1501; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1505; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1509; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1513; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1517; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000047a6_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_nt_align8.compute_90.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16832spgemm_objs.dir/generated/gemm/80/h16832spgemm/cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_nh_align8.cu.o ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1112; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1116; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1120; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1124; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1128; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1132; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1136; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1140; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1144; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1148; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1152; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1156; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1403; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1407; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1411; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1415; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1419; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1423; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1427; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1431; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1435; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1439; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1443; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1447; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1451; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1455; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1459; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_86.ptx, line 1463; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_nh_align8.cu.o ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1112; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1116; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1120; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1124; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1128; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1132; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1136; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1140; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1144; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1148; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1152; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1156; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1403; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1407; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1411; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1415; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1419; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1423; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1427; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1431; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1435; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1439; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1443; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1447; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1451; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1455; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1459; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_89.ptx, line 1463; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1112; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1116; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1120; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1124; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1128; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1132; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1136; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1140; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1144; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1148; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1152; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1156; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1403; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1407; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1411; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1415; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1419; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1423; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1427; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1431; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1435; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1439; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1443; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1447; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1451; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1455; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1459; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000480d_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tn_align8.compute_90.ptx, line 1463; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16832spgemm_objs.dir/generated/gemm/80/h16832spgemm/cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_ch_align8.cu.o ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1171; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1175; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1179; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1183; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1187; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1191; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1195; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1199; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1203; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1207; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1211; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1215; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1219; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1223; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1227; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1231; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1500; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1504; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1508; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1512; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1516; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1520; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1524; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1528; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1532; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1536; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1540; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1544; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-8_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_86.ptx, line 1548; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1171; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1175; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1179; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1183; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1187; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1191; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1195; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1199; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1203; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1207; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1211; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1215; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1219; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1223; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1227; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1231; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1500; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1504; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1508; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1512; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1516; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1520; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1524; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1528; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1532; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1536; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1540; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1544; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-7_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_89.ptx, line 1548; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_ch_align8.cu.o ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1171; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1175; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1179; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1183; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1187; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1191; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1195; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1199; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1203; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1207; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1211; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1215; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1219; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1223; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1227; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1231; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1500; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1504; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1508; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1512; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1516; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1520; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1524; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1528; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1532; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1536; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1540; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1544; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004877_00000000-6_cutlass_tensorop_h16832spgemm_64x128_64x6_tt_align8.compute_90.ptx, line 1548; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 15%] Built target cutlass_library_gemm_sm80_h16832spgemm_objs [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i168128spgemm_s4_objs.dir/generated/gemm/80/i168128spgemm_s4/all_sm80_i168128spgemm_s4_gemm_operations.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_tn_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i168128spgemm_s4_objs.dir/generated/gemm/80/i168128spgemm_s4/cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_tn_align8.cu.o ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 772; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 776; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 780; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 784; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 788; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 792; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 796; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 800; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 962; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 966; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 970; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 974; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 978; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 982; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 986; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-8_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 990; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 772; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 776; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 780; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 784; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 788; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 792; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 796; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 800; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 962; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 966; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 970; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 974; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 978; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 982; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 986; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-7_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 990; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 772; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 776; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 780; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 784; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 788; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 792; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 796; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 800; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 962; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 966; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 970; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 974; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 978; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 982; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 986; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004926_00000000-6_cutlass_tensorop_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 990; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas , line 3; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas , line 3; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_hn_align8.cu.o [ 15%] Built target cutlass_library_gemm_sm80_i168128spgemm_s4_objs [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i168256andgemm_b1_objs.dir/generated/gemm/80/i168256andgemm_b1/all_sm80_i168256andgemm_b1_gemm_operations.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i168256andgemm_b1_objs.dir/generated/gemm/80/i168256andgemm_b1/cutlass_tensorop_i168256andgemm_b1_256x128_512x3_tn_align128.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_hn_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_tc_align8.cu.o [ 15%] Built target cutlass_library_gemm_sm80_i168256andgemm_b1_objs [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i168256xorgemm_b1_objs.dir/generated/gemm/80/i168256xorgemm_b1/all_sm80_i168256xorgemm_b1_gemm_operations.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_tc_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i168256xorgemm_b1_objs.dir/generated/gemm/80/i168256xorgemm_b1/cutlass_tensorop_i168256xorgemm_b1_256x128_512x3_tn_align128.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_hc_align8.cu.o [ 15%] Built target cutlass_library_gemm_sm80_i168256xorgemm_b1_objs [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16832gemm_s8_objs.dir/generated/gemm/80/i16832gemm_s8/all_sm80_i16832gemm_s8_gemm_operations.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_hc_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_tt_align8.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16832gemm_s8_objs.dir/generated/gemm/80/i16832gemm_s8/cutlass_tensorop_i16832gemm_s8_256x128_64x3_tn_align16.cu.o [ 15%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_tt_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_ht_align8.cu.o [ 16%] Built target cutlass_library_gemm_sm80_i16832gemm_s8_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16832gemm_u8_objs.dir/generated/gemm/80/i16832gemm_u8/all_sm80_i16832gemm_u8_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16832gemm_u8_objs.dir/generated/gemm/80/i16832gemm_u8/cutlass_tensorop_i16832gemm_u8_256x128_64x3_tn_align16.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_th_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_ht_align8.cu.o [ 16%] Built target cutlass_library_gemm_sm80_i16832gemm_u8_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16864gemm_s4_objs.dir/generated/gemm/80/i16864gemm_s4/all_sm80_i16864gemm_s4_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_fp8in_fp32out.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs.dir/generated/gemm/80/h16816gemm_planar_complex/cutlass_tensorop_h16816gemm_planar_complex_64x128_32x3_hh_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16864gemm_s4_objs.dir/generated/gemm/80/i16864gemm_s4/cutlass_tensorop_i16864gemm_s4_256x128_128x3_tn_align32.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_th_align8.cu.o [ 16%] Built target cutlass_library_gemm_sm80_h16816gemm_planar_complex_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16864gemm_u4_objs.dir/generated/gemm/80/i16864gemm_u4/all_sm80_i16864gemm_u4_gemm_operations.cu.o [ 16%] Built target cutlass_library_gemm_sm80_i16864gemm_s4_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16864spgemm_s8_objs.dir/generated/gemm/80/i16864spgemm_s8/all_sm80_i16864spgemm_s8_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs.dir/generated/gemm/80/h16816gemm_planar_complex_array/cutlass_tensorop_h16816gemm_planar_complex_array_64x128_32x3_hh_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16864gemm_u4_objs.dir/generated/gemm/80/i16864gemm_u4/cutlass_tensorop_i16864gemm_u4_256x128_128x3_tn_align32.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_i16864spgemm_s8_objs.dir/generated/gemm/80/i16864spgemm_s8/cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.cu.o ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 754; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 758; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 762; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 766; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 770; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 774; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 778; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 782; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 786; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 790; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 794; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 798; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 802; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 806; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 810; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 814; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1014; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1018; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1022; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1026; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1030; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1034; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1038; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1042; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1046; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1050; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1054; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1058; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1062; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1066; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1070; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-8_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1074; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 16%] Built target cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_bf16_objs.dir/generated/gemm/80/s16816gemm_bf16/all_sm80_s16816gemm_bf16_gemm_operations.cu.o ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 754; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 758; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 762; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 766; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 770; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 774; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 778; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 782; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 786; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 790; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 794; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 798; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 802; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 806; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 810; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 814; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1014; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1018; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1022; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1026; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1030; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1034; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1038; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1042; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1046; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1050; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1054; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1058; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1062; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1066; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1070; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-7_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1074; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 16%] Built target cutlass_library_gemm_sm80_i16864gemm_u4_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_bf16_s8_objs.dir/generated/gemm/80/s16816gemm_bf16_s8/all_sm80_s16816gemm_bf16_s8_gemm_operations.cu.o ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 754; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 758; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 762; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 766; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 770; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 774; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 778; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 782; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 786; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 790; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 794; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 798; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 802; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 806; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 810; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 814; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1014; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1018; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1022; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1026; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1030; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1034; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1038; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1042; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1046; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1050; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1054; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1058; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1062; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1066; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1070; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00004d54_00000000-6_cutlass_tensorop_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1074; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 16%] Built target cutlass_library_gemm_sm80_i16864spgemm_s8_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_bf16_u8_objs.dir/generated/gemm/80/s16816gemm_bf16_u8/all_sm80_s16816gemm_bf16_u8_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_bf16_objs.dir/generated/gemm/80/s16816gemm_bf16/cutlass_tensorop_s16816gemm_bf16_256x128_32x3_nn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_bf16_s8_objs.dir/generated/gemm/80/s16816gemm_bf16_s8/cutlass_tensorop_s16816gemm_bf16_s8_128x128_64x4_tn_align16.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_bf16_u8_objs.dir/generated/gemm/80/s16816gemm_bf16_u8/cutlass_tensorop_s16816gemm_bf16_u8_128x128_64x4_tn_align16.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_bf16_objs.dir/generated/gemm/80/s16816gemm_bf16/cutlass_tensorop_s16816gemm_bf16_256x128_32x3_nt_align8.cu.o [ 16%] Built target cutlass_library_gemm_sm80_s16816gemm_bf16_s8_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_f16_objs.dir/generated/gemm/80/s16816gemm_f16/all_sm80_s16816gemm_f16_gemm_operations.cu.o [ 16%] Built target cutlass_library_gemm_sm80_s16816gemm_bf16_u8_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_f16_s8_objs.dir/generated/gemm/80/s16816gemm_f16_s8/all_sm80_s16816gemm_f16_s8_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_f16_objs.dir/generated/gemm/80/s16816gemm_f16/cutlass_tensorop_s16816gemm_f16_256x128_32x3_nn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_f16_s8_objs.dir/generated/gemm/80/s16816gemm_f16_s8/cutlass_tensorop_s16816gemm_f16_s8_128x128_64x4_tn_align16.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_bf16_objs.dir/generated/gemm/80/s16816gemm_bf16/cutlass_tensorop_s16816gemm_bf16_256x128_32x3_tn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_f16_objs.dir/generated/gemm/80/s16816gemm_f16/cutlass_tensorop_s16816gemm_f16_256x128_32x3_nt_align8.cu.o [ 16%] Built target cutlass_library_gemm_sm80_s16816gemm_f16_s8_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_f16_u8_objs.dir/generated/gemm/80/s16816gemm_f16_u8/all_sm80_s16816gemm_f16_u8_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_bf16_objs.dir/generated/gemm/80/s16816gemm_bf16/cutlass_tensorop_s16816gemm_bf16_256x128_32x3_tt_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_f16_u8_objs.dir/generated/gemm/80/s16816gemm_f16_u8/cutlass_tensorop_s16816gemm_f16_u8_128x128_64x4_tn_align16.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_f16_objs.dir/generated/gemm/80/s16816gemm_f16/cutlass_tensorop_s16816gemm_f16_256x128_32x3_tn_align8.cu.o [ 16%] Built target cutlass_library_gemm_sm80_s16816gemm_bf16_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_bf16_objs.dir/generated/gemm/80/s16816gemm_grouped_bf16/all_sm80_s16816gemm_grouped_bf16_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_f16_objs.dir/generated/gemm/80/s16816gemm_f16/cutlass_tensorop_s16816gemm_f16_256x128_32x3_tt_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_bf16_objs.dir/generated/gemm/80/s16816gemm_grouped_bf16/cutlass_tensorop_s16816gemm_grouped_bf16_256x128_32x3_nn_align8_scheduleDevice.cu.o [ 16%] Built target cutlass_library_gemm_sm80_s16816gemm_f16_u8_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_f16_objs.dir/generated/gemm/80/s16816gemm_grouped_f16/all_sm80_s16816gemm_grouped_f16_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_f16_objs.dir/generated/gemm/80/s16816gemm_grouped_f16/cutlass_tensorop_s16816gemm_grouped_f16_256x128_32x3_nn_align8_scheduleDevice.cu.o [ 16%] Built target cutlass_library_gemm_sm80_s16816gemm_f16_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/all_sm80_s16816gemm_planar_complex_array_bf16_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_bf16_objs.dir/generated/gemm/80/s16816gemm_grouped_bf16/cutlass_tensorop_s16816gemm_grouped_bf16_256x128_32x3_nt_align8_scheduleDevice.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_nn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_f16_objs.dir/generated/gemm/80/s16816gemm_grouped_f16/cutlass_tensorop_s16816gemm_grouped_f16_256x128_32x3_nt_align8_scheduleDevice.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_bf16_objs.dir/generated/gemm/80/s16816gemm_grouped_bf16/cutlass_tensorop_s16816gemm_grouped_bf16_256x128_32x3_tn_align8_scheduleDevice.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_cn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_f16_objs.dir/generated/gemm/80/s16816gemm_grouped_f16/cutlass_tensorop_s16816gemm_grouped_f16_256x128_32x3_tn_align8_scheduleDevice.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_bf16_objs.dir/generated/gemm/80/s16816gemm_grouped_bf16/cutlass_tensorop_s16816gemm_grouped_bf16_256x128_32x3_tt_align8_scheduleDevice.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_nc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_grouped_f16_objs.dir/generated/gemm/80/s16816gemm_grouped_f16/cutlass_tensorop_s16816gemm_grouped_f16_256x128_32x3_tt_align8_scheduleDevice.cu.o [ 16%] Built target cutlass_library_gemm_sm80_s16816gemm_grouped_bf16_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/all_sm80_s16816gemm_planar_complex_array_f16_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_cc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_nn_align8.cu.o [ 16%] Built target cutlass_library_gemm_sm80_s16816gemm_grouped_f16_objs [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/all_sm80_s16816gemm_planar_complex_bf16_gemm_operations.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_nn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_fp32out.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_nt_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_cn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_cn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_ct_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_nc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_nc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_nh_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_cc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_cc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_ch_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_nt_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_nt_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_tn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_ct_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_ct_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_hn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_nh_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_nh_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_ch_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_tc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_ch_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_tn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_hc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_tn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_hn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_tt_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_hn_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_tc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_ht_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_tc_align8.cu.o [ 16%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_hc_align8.cu.o [ 17%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_th_align8.cu.o [ 17%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_hc_align8.cu.o [ 17%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_fp_other.cu.o [ 17%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_tt_align8.cu.o [ 17%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_bf16/cutlass_tensorop_s16816gemm_planar_complex_array_bf16_64x128_32x3_hh_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_tt_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_ht_align8.cu.o [ 18%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_objs [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/all_sm80_s16816gemm_planar_complex_f16_gemm_operations.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_ht_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_nn_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_th_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_th_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_cn_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_array_f16/cutlass_tensorop_s16816gemm_planar_complex_array_f16_64x128_32x3_hh_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_bf16/cutlass_tensorop_s16816gemm_planar_complex_bf16_64x128_32x3_hh_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_nc_align8.cu.o [ 18%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_objs [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_s8_bf16_objs.dir/generated/gemm/80/s16816gemm_s8_bf16/all_sm80_s16816gemm_s8_bf16_gemm_operations.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_s8_bf16_objs.dir/generated/gemm/80/s16816gemm_s8_bf16/cutlass_tensorop_s16816gemm_s8_bf16_128x128_64x4_tn_align16.cu.o [ 18%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_objs [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_s8_f16_objs.dir/generated/gemm/80/s16816gemm_s8_f16/all_sm80_s16816gemm_s8_f16_gemm_operations.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_cc_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_s8_f16_objs.dir/generated/gemm/80/s16816gemm_s8_f16/cutlass_tensorop_s16816gemm_s8_f16_128x128_64x4_tn_align16.cu.o [ 18%] Built target cutlass_library_gemm_sm80_s16816gemm_s8_bf16_objs [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_u8_bf16_objs.dir/generated/gemm/80/s16816gemm_u8_bf16/all_sm80_s16816gemm_u8_bf16_gemm_operations.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_nt_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_u8_bf16_objs.dir/generated/gemm/80/s16816gemm_u8_bf16/cutlass_tensorop_s16816gemm_u8_bf16_128x128_64x4_tn_align16.cu.o [ 18%] Built target cutlass_library_gemm_sm80_s16816gemm_s8_f16_objs [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_cf32_cfprop_optimized_cf32_objs.dir/generated/conv2d/75/cf32_cfprop_optimized_cf32/all_sm75_cf32_cfprop_optimized_cf32_conv2d_operations.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_cf32_cfprop_optimized_cf32_objs.dir/generated/conv2d/75/cf32_cfprop_optimized_cf32/cutlass_simt_cf32_cfprop_optimized_cf32_128x128_8x5_nhwc_align1.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_ct_align8.cu.o [ 18%] Built target cutlass_library_gemm_sm80_s16816gemm_u8_bf16_objs [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_nh_align8.cu.o [ 18%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_ch_align8.cu.o [ 18%] Built target cutlass_library_conv2d_sm75_cf32_cfprop_optimized_cf32_objs [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_u8_f16_objs.dir/generated/gemm/80/s16816gemm_u8_f16/all_sm80_s16816gemm_u8_f16_gemm_operations.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_tn_align8.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_u8_f16_objs.dir/generated/gemm/80/s16816gemm_u8_f16/cutlass_tensorop_s16816gemm_u8_f16_128x128_64x4_tn_align16.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_hn_align8.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_tc_align8.cu.o [ 19%] Built target cutlass_library_gemm_sm80_s16816gemm_u8_f16_objs [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816tf32spgemm_objs.dir/generated/gemm/80/s16816tf32spgemm/all_sm80_s16816tf32spgemm_gemm_operations.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_hc_align8.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816tf32spgemm_objs.dir/generated/gemm/80/s16816tf32spgemm/cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_tt_align8.cu.o ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 909; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 913; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 917; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 921; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 925; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 929; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 933; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 937; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 941; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 945; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 949; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 953; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 957; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 961; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 965; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 969; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1331; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1335; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1339; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1343; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1347; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1351; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1355; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1359; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1363; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1367; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1371; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1375; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1379; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1383; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1387; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_86.ptx, line 1391; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_ht_align8.cu.o ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 909; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 913; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 917; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 921; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 925; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 929; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 933; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 937; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 941; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 945; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 949; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 953; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 957; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 961; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 965; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 969; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1331; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1335; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1339; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1343; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1347; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1351; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1355; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1359; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1363; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1367; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1371; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1375; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1379; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1383; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1387; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_89.ptx, line 1391; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 845; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 849; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 853; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 857; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 861; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 865; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 869; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 873; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 877; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 881; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 885; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 889; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 893; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 897; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 901; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 905; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1203; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1207; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1211; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1215; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1219; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1223; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1227; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1231; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1235; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1239; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1243; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1247; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1251; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1255; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1259; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c2b_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nn_align4.compute_90.ptx, line 1263; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816tf32spgemm_objs.dir/generated/gemm/80/s16816tf32spgemm/cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/gemm_fp_mixed_input.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_th_align8.cu.o ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 905; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 909; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 913; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 917; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 921; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 925; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 929; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 933; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 937; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 941; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 945; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 949; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 953; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 957; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 961; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 965; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1333; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1337; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1341; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1345; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1349; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1353; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1357; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1361; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1365; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1369; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1373; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1377; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1381; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1385; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1389; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_86.ptx, line 1393; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs.dir/generated/gemm/80/s16816gemm_planar_complex_f16/cutlass_tensorop_s16816gemm_planar_complex_f16_64x128_32x3_hh_align8.cu.o ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 905; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 909; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 913; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 917; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 921; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 925; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 929; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 933; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 937; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 941; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 945; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 949; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 953; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 957; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 961; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 965; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1333; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1337; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1341; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1345; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1349; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1353; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1357; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1361; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1365; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1369; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1373; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1377; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1381; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1385; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1389; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_89.ptx, line 1393; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 841; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 845; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 849; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 853; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 857; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 861; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 865; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 869; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 873; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 877; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 881; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 885; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 889; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 893; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 897; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 901; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1209; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1213; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1217; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1221; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1225; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1229; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1233; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1237; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1241; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1245; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1249; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1253; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1257; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1261; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005c99_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_nt_align4.compute_90.ptx, line 1265; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816tf32spgemm_objs.dir/generated/gemm/80/s16816tf32spgemm/cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16816tf32spgemm_objs.dir/generated/gemm/80/s16816tf32spgemm/cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.cu.o [ 19%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_objs [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_bf16_objs.dir/generated/gemm/80/s16832spgemm_bf16/all_sm80_s16832spgemm_bf16_gemm_operations.cu.o ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 923; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 927; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 931; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 935; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 939; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 943; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 947; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 951; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 955; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 959; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 963; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 967; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 971; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 975; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 979; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 983; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1343; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1347; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1351; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1355; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1359; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1363; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1367; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1371; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1375; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1379; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1383; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1387; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1391; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1395; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1399; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_86.ptx, line 1403; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 923; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 927; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 931; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 935; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 939; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 943; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 947; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 951; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 955; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 959; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 963; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 967; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 971; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 975; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 979; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 983; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1343; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1347; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1351; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1355; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1359; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1363; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1367; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1371; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1375; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1379; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1383; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1387; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1391; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1395; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1399; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_89.ptx, line 1403; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 859; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 863; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 867; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 871; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 875; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 879; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 883; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 887; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 891; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 895; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 899; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 903; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 907; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 911; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 915; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 919; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1215; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1219; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1223; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1227; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1231; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1235; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1239; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1243; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1247; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1251; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1255; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1259; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1263; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1267; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1271; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d17_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tn_align4.compute_90.ptx, line 1275; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_bf16_objs.dir/generated/gemm/80/s16832spgemm_bf16/cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_bf16_objs.dir/generated/gemm/80/s16832spgemm_bf16/cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.cu.o ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 926; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 930; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 934; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 938; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 942; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 946; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 950; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 954; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 958; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 962; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 966; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 970; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 974; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 978; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 982; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 986; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1349; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1353; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1357; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1361; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1365; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1369; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1373; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1377; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1381; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1385; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1389; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1393; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1397; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1401; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1405; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-8_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_86.ptx, line 1409; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 926; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 930; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 934; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 938; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 942; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 946; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 950; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 954; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 958; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 962; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 966; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 970; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 974; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 978; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 982; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 986; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1349; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1353; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1357; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1361; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1365; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1369; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1373; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1377; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1381; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1385; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1389; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1393; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1397; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1401; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1405; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-7_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_89.ptx, line 1409; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 862; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 866; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 870; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 874; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 878; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 882; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 886; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 890; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 894; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 898; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 902; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 906; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 910; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 914; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 918; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 922; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1221; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1225; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1229; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1233; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1237; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1241; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1245; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1249; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1253; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1257; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1261; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1265; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1269; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1273; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1277; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d3c_00000000-6_cutlass_tensorop_s16816tf32spgemm_128x64_32x3_tt_align4.compute_90.ptx, line 1281; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Built target cutlass_library_gemm_sm80_s16816tf32spgemm_objs [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_f16_objs.dir/generated/gemm/80/s16832spgemm_f16/all_sm80_s16832spgemm_f16_gemm_operations.cu.o ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_86.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_86.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_f16_objs.dir/generated/gemm/80/s16832spgemm_f16/cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.cu.o ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_89.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_89.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d92_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nn_align8.compute_90.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005d9b_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_nt_align8.compute_90.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_bf16_objs.dir/generated/gemm/80/s16832spgemm_bf16/cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_bf16_objs.dir/generated/gemm/80/s16832spgemm_bf16/cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.cu.o ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_86.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_89.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_86.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1160; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1164; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1168; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1172; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1176; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1180; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1184; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1188; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1192; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1196; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1200; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1454; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1458; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1462; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1466; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1470; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1474; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1478; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1482; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1486; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1490; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1494; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1498; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1502; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1506; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1510; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005dfb_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nn_align8.compute_90.ptx, line 1514; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_89.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-8_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_86.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_f16_objs.dir/generated/gemm/80/s16832spgemm_f16/cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.cu.o ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e1c_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tn_align8.compute_90.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-7_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_89.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_f16_objs.dir/generated/gemm/80/s16832spgemm_f16/cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.cu.o ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e29_00000000-6_cutlass_tensorop_s16832spgemm_bf16_64x128_64x6_tt_align8.compute_90.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Built target cutlass_library_gemm_sm80_s16832spgemm_bf16_objs [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688bf16gemm_objs.dir/generated/gemm/80/s1688bf16gemm/all_sm80_s1688bf16gemm_gemm_operations.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688bf16gemm_objs.dir/generated/gemm/80/s1688bf16gemm/cutlass_tensorop_s1688bf16gemm_256x128_16x3_nn_align4.cu.o ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_86.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_89.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_86.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1222; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1226; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1230; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1234; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1238; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1242; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1246; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1250; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1254; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1258; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1262; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1266; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1270; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1274; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1278; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1282; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1534; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1538; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1542; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1546; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1550; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1554; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1558; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1562; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1566; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1570; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1574; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1578; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1582; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1586; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1590; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e66_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_nt_align8.compute_90.ptx, line 1594; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_89.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s16832spgemm_f16_objs.dir/generated/gemm/80/s16832spgemm_f16/cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.cu.o ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1145; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1149; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1153; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1157; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1161; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1165; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1169; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1173; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1177; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1181; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1185; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1189; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1193; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1197; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1201; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1205; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1436; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1440; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1444; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1448; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1452; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1456; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1460; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1464; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1468; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1472; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1476; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1480; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1484; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1488; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1492; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005e81_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tn_align8.compute_90.ptx, line 1496; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688bf16gemm_objs.dir/generated/gemm/80/s1688bf16gemm/cutlass_tensorop_s1688bf16gemm_256x128_16x3_nt_align4.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688bf16gemm_objs.dir/generated/gemm/80/s1688bf16gemm/cutlass_tensorop_s1688bf16gemm_256x128_16x3_tn_align4.cu.o ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-8_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_86.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-7_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_89.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1204; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1208; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1212; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1216; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1220; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1224; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1228; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1232; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1236; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1240; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1244; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1248; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1252; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1256; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1260; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1264; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1521; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1525; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1529; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1533; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1537; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1541; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1545; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1549; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1553; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1557; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1561; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1565; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1569; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1573; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1577; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00005ef0_00000000-6_cutlass_tensorop_s16832spgemm_f16_64x128_64x6_tt_align8.compute_90.ptx, line 1581; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 19%] Built target cutlass_library_gemm_sm80_s16832spgemm_f16_objs [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688f16gemm_objs.dir/generated/gemm/80/s1688f16gemm/all_sm80_s1688f16gemm_gemm_operations.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688bf16gemm_objs.dir/generated/gemm/80/s1688bf16gemm/cutlass_tensorop_s1688bf16gemm_256x128_16x3_tt_align4.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688f16gemm_objs.dir/generated/gemm/80/s1688f16gemm/cutlass_tensorop_s1688f16gemm_256x128_16x3_nn_align4.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688f16gemm_objs.dir/generated/gemm/80/s1688f16gemm/cutlass_tensorop_s1688f16gemm_256x128_16x3_nt_align4.cu.o [ 19%] Built target cutlass_library_gemm_sm80_s1688bf16gemm_objs [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_objs.dir/generated/gemm/80/s1688gemm/all_sm80_s1688gemm_gemm_operations.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688f16gemm_objs.dir/generated/gemm/80/s1688f16gemm/cutlass_tensorop_s1688f16gemm_256x128_16x3_tn_align4.cu.o [ 19%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_objs.dir/generated/gemm/80/s1688gemm/cutlass_tensorop_s1688gemm_128x128_16x4_nn_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688f16gemm_objs.dir/generated/gemm/80/s1688f16gemm/cutlass_tensorop_s1688f16gemm_256x128_16x3_tt_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_objs.dir/generated/gemm/80/s1688gemm/cutlass_tensorop_s1688gemm_128x128_16x4_nt_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_objs.dir/generated/gemm/80/s1688gemm/cutlass_tensorop_s1688gemm_128x128_16x4_tn_align4.cu.o [ 20%] Built target cutlass_library_gemm_sm80_s1688f16gemm_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_tf32_objs.dir/generated/gemm/80/s1688gemm_tf32/all_sm80_s1688gemm_tf32_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_tf32_objs.dir/generated/gemm/80/s1688gemm_tf32/cutlass_tensorop_s1688gemm_tf32_256x128_16x3_nn_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_objs.dir/generated/gemm/80/s1688gemm/cutlass_tensorop_s1688gemm_128x128_16x4_tt_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_tf32_objs.dir/generated/gemm/80/s1688gemm_tf32/cutlass_tensorop_s1688gemm_tf32_256x128_16x3_nt_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_tf32_objs.dir/generated/gemm/80/s1688gemm_tf32/cutlass_tensorop_s1688gemm_tf32_256x128_16x3_tn_align4.cu.o [ 20%] Built target cutlass_library_gemm_sm80_s1688gemm_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688tf32gemm_objs.dir/generated/gemm/80/s1688tf32gemm/all_sm80_s1688tf32gemm_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688gemm_tf32_objs.dir/generated/gemm/80/s1688gemm_tf32/cutlass_tensorop_s1688gemm_tf32_256x128_16x3_tt_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688tf32gemm_objs.dir/generated/gemm/80/s1688tf32gemm/cutlass_tensorop_s1688tf32gemm_256x128_16x3_nn_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688tf32gemm_objs.dir/generated/gemm/80/s1688tf32gemm/cutlass_tensorop_s1688tf32gemm_256x128_16x3_nt_align4.cu.o [ 20%] Built target cutlass_library_gemm_sm80_s1688gemm_tf32_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s4_i168128spgemm_s4_objs.dir/generated/gemm/80/s4_i168128spgemm_s4/all_sm80_s4_i168128spgemm_s4_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688tf32gemm_objs.dir/generated/gemm/80/s1688tf32gemm/cutlass_tensorop_s1688tf32gemm_256x128_16x3_tn_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s4_i168128spgemm_s4_objs.dir/generated/gemm/80/s4_i168128spgemm_s4/cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s1688tf32gemm_objs.dir/generated/gemm/80/s1688tf32gemm/cutlass_tensorop_s1688tf32gemm_256x128_16x3_tt_align4.cu.o ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 773; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 777; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 781; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 785; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 789; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 793; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 797; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 801; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 963; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 967; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 971; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 975; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 979; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 983; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 987; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-8_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_86.ptx, line 991; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 773; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 777; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 781; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 785; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 789; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 793; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 797; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 801; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 963; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 967; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 971; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 975; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 979; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 983; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 987; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-7_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_89.ptx, line 991; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 773; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 777; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 781; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 785; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 789; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 793; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 797; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 801; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 963; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 967; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 971; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 975; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 979; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 983; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 987; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_0000624c_00000000-6_cutlass_tensorop_s4_i168128spgemm_s4_64x64_256x4_tn_align32.compute_90.ptx, line 991; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas , line 3; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas , line 3; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s4_i16864gemm_s4_objs.dir/generated/gemm/80/s4_i16864gemm_s4/all_sm80_s4_i16864gemm_s4_gemm_operations.cu.o [ 20%] Built target cutlass_library_gemm_sm80_s4_i168128spgemm_s4_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s8_i16832gemm_s8_objs.dir/generated/gemm/80/s8_i16832gemm_s8/all_sm80_s8_i16832gemm_s8_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s4_i16864gemm_s4_objs.dir/generated/gemm/80/s4_i16864gemm_s4/cutlass_tensorop_s4_i16864gemm_s4_256x128_128x3_tn_align32.cu.o [ 20%] Built target cutlass_library_gemm_sm80_s1688tf32gemm_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s8_i16832gemm_s8_objs.dir/generated/gemm/80/s8_i16832gemm_s8/cutlass_tensorop_s8_i16832gemm_s8_256x128_64x3_tn_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s8_i16832gemm_s8_objs.dir/generated/gemm/80/s8_i16832gemm_s8/cutlass_tensorop_s8_i16832gemm_s8_256x128_64x3_n32t32_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s4_i16864gemm_s4_objs.dir/generated/gemm/80/s4_i16864gemm_s4/cutlass_tensorop_s4_i16864gemm_s4_256x128_128x3_n64t64_align32.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s8_i16864spgemm_s8_objs.dir/generated/gemm/80/s8_i16864spgemm_s8/all_sm80_s8_i16864spgemm_s8_gemm_operations.cu.o [ 20%] Built target cutlass_library_gemm_sm80_s8_i16832gemm_s8_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_sgemm_objs.dir/generated/gemm/80/sgemm/all_sm80_sgemm_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_s8_i16864spgemm_s8_objs.dir/generated/gemm/80/s8_i16864spgemm_s8/cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_sgemm_objs.dir/generated/gemm/80/sgemm/cutlass_simt_sgemm_256x128_8x5_nn_align1.cu.o [ 20%] Built target cutlass_library_gemm_sm80_s4_i16864gemm_s4_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_tf32_s1688gemm_tf32_objs.dir/generated/gemm/80/tf32_s1688gemm_tf32/all_sm80_tf32_s1688gemm_tf32_gemm_operations.cu.o ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-8_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_86.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-7_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_89.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_tf32_s1688gemm_tf32_objs.dir/generated/gemm/80/tf32_s1688gemm_tf32/cutlass_tensorop_tf32_s1688gemm_tf32_256x128_16x3_nn_align4.cu.o ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_000063c7_00000000-6_cutlass_tensorop_s8_i16864spgemm_s8_128x64_128x3_tn_align16.compute_90.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 20%] Built target cutlass_library_gemm_sm80_s8_i16864spgemm_s8_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_u4_i16864gemm_u4_objs.dir/generated/gemm/80/u4_i16864gemm_u4/all_sm80_u4_i16864gemm_u4_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_u4_i16864gemm_u4_objs.dir/generated/gemm/80/u4_i16864gemm_u4/cutlass_tensorop_u4_i16864gemm_u4_256x128_128x3_tn_align32.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_sgemm_objs.dir/generated/gemm/80/sgemm/cutlass_simt_sgemm_256x128_8x5_nt_align1.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_tf32_s1688gemm_tf32_objs.dir/generated/gemm/80/tf32_s1688gemm_tf32/cutlass_tensorop_tf32_s1688gemm_tf32_256x128_16x3_nt_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_u4_i16864gemm_u4_objs.dir/generated/gemm/80/u4_i16864gemm_u4/cutlass_tensorop_u4_i16864gemm_u4_256x128_128x3_n64t64_align32.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_sgemm_objs.dir/generated/gemm/80/sgemm/cutlass_simt_sgemm_256x128_8x5_tn_align1.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_tf32_s1688gemm_tf32_objs.dir/generated/gemm/80/tf32_s1688gemm_tf32/cutlass_tensorop_tf32_s1688gemm_tf32_256x128_16x3_tn_align4.cu.o [ 20%] Built target cutlass_library_gemm_sm80_u4_i16864gemm_u4_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_u8_i16832gemm_u8_objs.dir/generated/gemm/80/u8_i16832gemm_u8/all_sm80_u8_i16832gemm_u8_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_u8_i16832gemm_u8_objs.dir/generated/gemm/80/u8_i16832gemm_u8/cutlass_tensorop_u8_i16832gemm_u8_256x128_64x3_tn_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_tf32_s1688gemm_tf32_objs.dir/generated/gemm/80/tf32_s1688gemm_tf32/cutlass_tensorop_tf32_s1688gemm_tf32_256x128_16x3_tt_align4.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_sgemm_objs.dir/generated/gemm/80/sgemm/cutlass_simt_sgemm_256x128_8x5_tt_align1.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_u8_i16832gemm_u8_objs.dir/generated/gemm/80/u8_i16832gemm_u8/cutlass_tensorop_u8_i16832gemm_u8_256x128_64x3_n32t32_align16.cu.o [ 20%] Built target cutlass_library_gemm_sm80_tf32_s1688gemm_tf32_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/all_sm80_z884gemm_gemm_operations.cu.o [ 20%] Built target cutlass_library_gemm_sm80_u8_i16832gemm_u8_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3_objs.dir/generated/gemm/89/s16832fastaccumgemm_e4m3/all_sm89_s16832fastaccumgemm_e4m3_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_nn_align1.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/initialize_reference_operations.cu.o [ 20%] Built target cutlass_library_gemm_sm80_sgemm_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2_objs.dir/generated/gemm/89/s16832fastaccumgemm_e4m3_e5m2/all_sm89_s16832fastaccumgemm_e4m3_e5m2_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3_objs.dir/generated/gemm/89/s16832fastaccumgemm_e4m3/cutlass_tensorop_s16832fastaccumgemm_e4m3_256x128_64x3_tn_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reduction/reduction_device.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2_objs.dir/generated/gemm/89/s16832fastaccumgemm_e4m3_e5m2/cutlass_tensorop_s16832fastaccumgemm_e4m3_e5m2_256x128_64x3_tn_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reduction/init_reduction_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_cn_align1.cu.o [ 20%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2_objs.dir/generated/gemm/89/s16832fastaccumgemm_e5m2/all_sm89_s16832fastaccumgemm_e5m2_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/conv2d.cu.o [ 20%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3_objs.dir/generated/gemm/89/s16832fastaccumgemm_e5m2_e4m3/all_sm89_s16832fastaccumgemm_e5m2_e4m3_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2_objs.dir/generated/gemm/89/s16832fastaccumgemm_e5m2/cutlass_tensorop_s16832fastaccumgemm_e5m2_256x128_64x3_tn_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3_objs.dir/generated/gemm/89/s16832fastaccumgemm_e5m2_e4m3/cutlass_tensorop_s16832fastaccumgemm_e5m2_e4m3_256x128_64x3_tn_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_nc_align1.cu.o [ 20%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832gemm_e4m3_objs.dir/generated/gemm/89/s16832gemm_e4m3/all_sm89_s16832gemm_e4m3_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_cc_align1.cu.o [ 20%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832gemm_e4m3_objs.dir/generated/gemm/89/s16832gemm_e4m3/cutlass_tensorop_s16832gemm_e4m3_256x128_64x3_tn_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832gemm_e4m3_e5m2_objs.dir/generated/gemm/89/s16832gemm_e4m3_e5m2/all_sm89_s16832gemm_e4m3_e5m2_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832gemm_e4m3_e5m2_objs.dir/generated/gemm/89/s16832gemm_e4m3_e5m2/cutlass_tensorop_s16832gemm_e4m3_e5m2_256x128_64x3_tn_align16.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_nt_align1.cu.o [ 20%] Built target cutlass_library_gemm_sm89_s16832gemm_e4m3_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832gemm_e5m2_objs.dir/generated/gemm/89/s16832gemm_e5m2/all_sm89_s16832gemm_e5m2_gemm_operations.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/src/reference/conv3d.cu.o [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832gemm_e5m2_objs.dir/generated/gemm/89/s16832gemm_e5m2/cutlass_tensorop_s16832gemm_e5m2_256x128_64x3_tn_align16.cu.o [ 20%] Built target cutlass_library_gemm_sm89_s16832gemm_e4m3_e5m2_objs [ 20%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832gemm_e5m2_e4m3_objs.dir/generated/gemm/89/s16832gemm_e5m2_e4m3/all_sm89_s16832gemm_e5m2_e4m3_gemm_operations.cu.o [ 21%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_ct_align1.cu.o [ 21%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16832gemm_e5m2_e4m3_objs.dir/generated/gemm/89/s16832gemm_e5m2_e4m3/cutlass_tensorop_s16832gemm_e5m2_e4m3_256x128_64x3_tn_align16.cu.o [ 21%] Built target cutlass_library_gemm_sm89_s16832gemm_e5m2_objs [ 21%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3_objs.dir/generated/gemm/89/s16864fastaccumspgemm_e4m3/all_sm89_s16864fastaccumspgemm_e4m3_gemm_operations.cu.o [ 21%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3_objs.dir/generated/gemm/89/s16864fastaccumspgemm_e4m3/cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.cu.o [ 21%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_nh_align1.cu.o [ 21%] Built target cutlass_library_gemm_sm89_s16832gemm_e5m2_e4m3_objs [ 21%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2_objs.dir/generated/gemm/89/s16864fastaccumspgemm_e4m3_e5m2/all_sm89_s16864fastaccumspgemm_e4m3_e5m2_gemm_operations.cu.o [ 21%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2_objs.dir/generated/gemm/89/s16864fastaccumspgemm_e4m3_e5m2/cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.cu.o ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a29_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 21%] Building CXX object tools/library/CMakeFiles/cutlass_library_objs.dir/generated/initialize_all.cpp.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/generated/gemm/all_gemm_operations.cu.o [ 22%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3_objs [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2_objs.dir/generated/gemm/89/s16864fastaccumspgemm_e5m2/all_sm89_s16864fastaccumspgemm_e5m2_gemm_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_ch_align1.cu.o ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006a8b_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/generated/conv2d/all_conv2d_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2_objs.dir/generated/gemm/89/s16864fastaccumspgemm_e5m2/cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.cu.o [ 22%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2_objs [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3_objs.dir/generated/gemm/89/s16864fastaccumspgemm_e5m2_e4m3/all_sm89_s16864fastaccumspgemm_e5m2_e4m3_gemm_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/generated/conv3d/all_conv3d_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3_objs.dir/generated/gemm/89/s16864fastaccumspgemm_e5m2_e4m3/cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_tn_align1.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/generated/rank_k/all_rank_k_operations.cu.o ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006b2c_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 22%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2_objs [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864spgemm_e4m3_objs.dir/generated/gemm/89/s16864spgemm_e4m3/all_sm89_s16864spgemm_e4m3_gemm_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/generated/rank_2k/all_rank_2k_operations.cu.o ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006bad_00000000-7_cutlass_tensorop_s16864fastaccumspgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864spgemm_e4m3_objs.dir/generated/gemm/89/s16864spgemm_e4m3/cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/generated/trmm/all_trmm_operations.cu.o [ 22%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3_objs [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864spgemm_e4m3_e5m2_objs.dir/generated/gemm/89/s16864spgemm_e4m3_e5m2/all_sm89_s16864spgemm_e4m3_e5m2_gemm_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_hn_align1.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_objs.dir/generated/symm/all_symm_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864spgemm_e4m3_e5m2_objs.dir/generated/gemm/89/s16864spgemm_e4m3_e5m2/cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.cu.o ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006c64_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 22%] Built target cutlass_library_objs [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864spgemm_e5m2_objs.dir/generated/gemm/89/s16864spgemm_e5m2/all_sm89_s16864spgemm_e5m2_gemm_operations.cu.o [ 22%] Built target cutlass_library_gemm_sm89_s16864spgemm_e4m3_objs [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864spgemm_e5m2_e4m3_objs.dir/generated/gemm/89/s16864spgemm_e5m2_e4m3/all_sm89_s16864spgemm_e5m2_e4m3_gemm_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_tc_align1.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864spgemm_e5m2_objs.dir/generated/gemm/89/s16864spgemm_e5m2/cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.cu.o ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006cf6_00000000-7_cutlass_tensorop_s16864spgemm_e4m3_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm89_s16864spgemm_e5m2_e4m3_objs.dir/generated/gemm/89/s16864spgemm_e5m2_e4m3/cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.cu.o [ 22%] Built target cutlass_library_gemm_sm89_s16864spgemm_e4m3_e5m2_objs [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/all_sm90_bf16_s64x128x16gemm_bf16_gemm_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_nnn_align8.cu.o ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006da3_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_128x64_128x3_tn_align16.compute_89.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 755; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 759; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 763; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 767; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 771; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 775; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 779; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 783; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 787; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 791; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 795; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 799; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 803; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 807; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 811; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 815; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1015; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1019; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1023; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1027; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1031; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1035; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1039; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1043; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1047; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1051; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1055; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1059; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1063; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1067; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1071; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures ptxas /tmp/tmpxft_00006db7_00000000-7_cutlass_tensorop_s16864spgemm_e5m2_e4m3_128x64_128x3_tn_align16.compute_89.ptx, line 1075; info : Advisory: Modifier '.sp::ordered_metadata' should be used on instruction 'mma' instead of modifier '.sp' as it is expected to have substantially reduced performance on some future architectures [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_hc_align1.cu.o [ 22%] Built target cutlass_library_gemm_sm89_s16864spgemm_e5m2_objs [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/all_sm90_bf16_s64x128x32gemm_e4m3_gemm_operations.cu.o [ 22%] Built target cutlass_library_gemm_sm89_s16864spgemm_e5m2_e4m3_objs [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/all_sm90_bf16_s64x128x32gemm_e4m3_e5m2_gemm_operations.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_tt_align1.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_ht_align1.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_th_align1.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm80_z884gemm_objs.dir/generated/gemm/80/z884gemm/cutlass_tensorop_z884gemm_128x64_8x3_hh_align1.cu.o [ 22%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 22%] Built target cutlass_library_gemm_sm80_z884gemm_objs [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/all_sm90_bf16_s64x128x32gemm_e5m2_gemm_operations.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ntn_align8.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 23%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_tnn_align8.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ttn_align8.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 24%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 25%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 26%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 27%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ttn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_nnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ntn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_tnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ttn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 28%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_nnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/bf16_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_1x1x1_0_ntn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 29%] Built target cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_objs [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/all_sm90_bf16_s64x128x32gemm_e5m2_e4m3_gemm_operations.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_objs [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_d1684gemm_objs.dir/generated/gemm/90/d1684gemm/all_sm90_d1684gemm_gemm_operations.cu.o [ 29%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_objs [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/all_sm90_f16_s64x128x16gemm_f16_gemm_operations.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_d1684gemm_objs.dir/generated/gemm/90/d1684gemm/cutlass_sm90_tensorop_d1684gemm_f64_f64_f64_f64_f64_128x128x16_1x1x1_3_nnn_align1.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_nnn_align8.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_d1684gemm_objs.dir/generated/gemm/90/d1684gemm/cutlass_sm90_tensorop_d1684gemm_f64_f64_f64_f64_f64_128x128x16_1x1x1_3_ntn_align1.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 29%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_d1684gemm_objs.dir/generated/gemm/90/d1684gemm/cutlass_sm90_tensorop_d1684gemm_f64_f64_f64_f64_f64_128x128x16_1x1x1_3_tnn_align1.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_d1684gemm_objs.dir/generated/gemm/90/d1684gemm/cutlass_sm90_tensorop_d1684gemm_f64_f64_f64_f64_f64_128x128x16_1x1x1_3_ttn_align1.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 30%] Built target cutlass_library_gemm_sm90_d1684gemm_objs [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/all_sm90_f16_s64x128x32gemm_e4m3_gemm_operations.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_objs [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/all_sm90_f16_s64x128x32gemm_e4m3_e5m2_gemm_operations.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ntn_align8.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_tnn_align8.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ttn_align8.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 30%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 31%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 32%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 33%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_bf16_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ttn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_nnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ntn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_tnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ttn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_nnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/bf16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_bf16_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs.dir/generated/gemm/90/f16_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f16_f16_128x128x64_1x1x1_0_ntn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 34%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_objs [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/all_sm90_f16_s64x128x32gemm_e5m2_gemm_operations.cu.o [ 34%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Built target cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_objs [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/all_sm90_f16_s64x128x32gemm_e5m2_e4m3_gemm_operations.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 35%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 36%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_objs [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/all_sm90_gz1684gemm_gemm_operations.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_nnn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 36%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_objs [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/all_sm90_h64x128x16gemm_gemm_operations.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_cnn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_nnn_align8.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_ncn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_ccn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_ntn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_ctn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_nhn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_chn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ntn_align8.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_tnn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_hnn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_tcn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_hcn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_ttn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_tnn_align8.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_htn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_thn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_gz1684gemm_objs.dir/generated/gemm/90/gz1684gemm/cutlass_sm90_tensorop_gz1684gemm_cf64_cf64_cf64_cf64_cf64_64x64x8_1x1x1_3_hhn_align1.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Built target cutlass_library_gemm_sm90_gz1684gemm_objs [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/all_sm90_i64x128x32gemm_s8_gemm_operations.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ttn_align8.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs.dir/generated/gemm/90/i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 36%] Built target cutlass_library_gemm_sm90_i64x128x32gemm_s8_objs [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/all_sm90_i64x128x32gemm_u8_gemm_operations.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_nosmem.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 36%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs.dir/generated/gemm/90/i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s32_s32_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 37%] Built target cutlass_library_gemm_sm90_i64x128x32gemm_u8_objs [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/all_sm90_s64x128x16gemm_bf16_gemm_operations.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 37%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 38%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 39%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 39%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_f16_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_void_f16_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/f16_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f16_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_epi_nosmem.cu.o [ 40%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_objs [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/all_sm90_s64x128x16gemm_f16_gemm_operations.cu.o [ 40%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_objs [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/all_sm90_s64x128x32gemm_e4m3_gemm_operations.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 40%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ttn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_nnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ntn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 41%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_tnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ttn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_nnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_h64x128x16gemm_objs.dir/generated/gemm/90/h64x128x16gemm/cutlass3x_sm90_tensorop_h64x128x16gemm_f16_f16_f16_f16_f16_128x128x64_1x1x1_0_ntn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 42%] Built target cutlass_library_gemm_sm90_h64x128x16gemm_objs [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/all_sm90_s64x128x32gemm_e4m3_e5m2_gemm_operations.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 42%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 43%] Built target cutlass_library_gemm_sm90_s64x128x16gemm_bf16_objs [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/all_sm90_s64x128x32gemm_e5m2_gemm_operations.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 43%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 44%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align2_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 45%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_tnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ttn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_nnn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs.dir/generated/gemm/90/s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_f32_f32_128x128x64_1x1x1_0_ntn_align2_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Built target cutlass_library_gemm_sm90_s64x128x16gemm_f16_objs [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/all_sm90_s64x128x32gemm_e5m2_e4m3_gemm_operations.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_objs [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/all_sm90_s64x128x8gemm_tf32_gemm_operations.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_64x128x32_2x1x1_0_tnn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 46%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_64x128x32_2x1x1_0_tnn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_64x128x32_2x1x1_0_nnn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_64x128x32_2x1x1_0_nnn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_64x128x32_2x1x1_0_ntn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_64x128x32_2x1x1_0_ntn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_tnn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 47%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_tnn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_tnn_align4_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_tnn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_nnn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_nnn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_nnn_align4_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_nnn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_ntn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_ntn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_ntn_align4_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 48%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_objs [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/all_sm90_s64x128x8tf32gemm_gemm_operations.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_64x128x32_2x1x1_0_tnn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_ntn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_64x128x32_2x1x1_0_tnn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_256x128x32_1x2x1_0_tnn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_64x128x32_2x1x1_0_nnn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_256x128x32_1x2x1_0_nnn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_64x128x32_2x1x1_0_nnn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_256x128x32_1x2x1_0_ntn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_64x128x32_2x1x1_0_ntn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_64x128x32_2x1x1_0_ttn_align4_warpspecialized_pingpong.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_64x128x32_2x1x1_0_ntn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_ttn_align4_warpspecialized_cooperative.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_tnn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_2x1x1_0_ttn_align4_warpspecialized_pingpong.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_tnn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_256x128x32_1x2x1_0_ttn_align4_warpspecialized_cooperative.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_tnn_align4_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_1x1x1_0_tnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_tnn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_1x1x1_0_nnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_fp8_fastaccum_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_1x1x1_0_ntn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_nnn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_1x1x1_0_tnn_align1_cpasync_warpspecialized_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 48%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_nnn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_1x1x1_0_nnn_align1_cpasync_warpspecialized_epi_nosmem.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_1x1x1_0_ntn_align1_cpasync_warpspecialized_epi_nosmem.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_nnn_align4_warpspecialized_cooperative_epi_tma.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_1x1x1_0_ttn_align2_cpasync_warpspecialized.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs.dir/generated/gemm/90/s64x128x8gemm_tf32/cutlass3x_sm90_tensorop_s64x128x8gemm_tf32_tf32_f32_f32_f32_128x128x32_1x1x1_0_ttn_align1_cpasync_warpspecialized.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_nnn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_tma.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 49%] Built target cutlass_library_gemm_sm90_s64x128x8gemm_tf32_objs [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/all_sm90_s8_i64x128x32gemm_s8_gemm_operations.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_2x1x1_0_tnn_align16.cu.o [ 49%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_ntn_align4_warpspecialized_pingpong_epi_tma.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_64x128x128_1x2x1_0_tnn_align16_warpspecialized_pingpong_fp8_fastaccum_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_tma.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_ntn_align4_warpspecialized_pingpong_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_ntn_align4_warpspecialized_cooperative_epi_tma.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 50%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_objs [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/all_sm90_s8_i64x128x32gemm_u8_gemm_operations.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_ntn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_2x1x1_0_tnn_align16.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_256x128x32_1x2x1_0_tnn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_tma.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x2x1_0_tnn_align16_stream_k_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_256x128x32_1x2x1_0_nnn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_256x128x32_1x2x1_0_ntn_align4_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_64x128x32_2x1x1_0_ttn_align4_warpspecialized_pingpong.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 50%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_ttn_align4_warpspecialized_cooperative.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_f32_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_2x1x1_0_ttn_align4_warpspecialized_pingpong.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_s8_s8_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_256x128x32_1x2x1_0_ttn_align4_warpspecialized_cooperative.cu.o [ 51%] Built target cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_objs [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_s8_objs.dir/generated/gemm/90/void_i64x128x32gemm_s8/all_sm90_void_i64x128x32gemm_s8_gemm_operations.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_s8_objs.dir/generated/gemm/90/void_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_tma.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_1x1x1_0_tnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_s8_objs.dir/generated/gemm/90/void_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_1x1x1_0_nnn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_s8_objs.dir/generated/gemm/90/void_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_1x1x1_0_ntn_align2_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e4m3_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs.dir/generated/gemm/90/s8_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_s8_u8_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_1x1x1_0_tnn_align1_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Built target cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_objs [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_u8_objs.dir/generated/gemm/90/void_i64x128x32gemm_u8/all_sm90_void_i64x128x32gemm_u8_gemm_operations.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_s8_objs.dir/generated/gemm/90/void_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_u8_objs.dir/generated/gemm/90/void_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_tma.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_1x1x1_0_nnn_align1_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_cpasync_warpspecialized_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_s8_objs.dir/generated/gemm/90/void_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_u8_objs.dir/generated/gemm/90/void_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_pingpong_epi_nosmem.cu.o [ 51%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_1x1x1_0_ntn_align1_cpasync_warpspecialized_epi_nosmem.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_s8_objs.dir/generated/gemm/90/void_i64x128x32gemm_s8/cutlass3x_sm90_tensorop_i64x128x32gemm_s8_s8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_cpasync_warpspecialized_epi_nosmem.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_1x1x1_0_ttn_align2_cpasync_warpspecialized.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_u8_objs.dir/generated/gemm/90/void_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align8_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs.dir/generated/gemm/90/s64x128x8tf32gemm/cutlass3x_sm90_tensorop_s64x128x8tf32gemm_f32_f32_f32_f32_f32_128x128x32_1x1x1_0_ttn_align1_cpasync_warpspecialized.cu.o [ 52%] Built target cutlass_library_gemm_sm90_void_i64x128x32gemm_s8_objs [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/all_sm90_void_s64x128x16gemm_bf16_gemm_operations.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_u8_objs.dir/generated/gemm/90/void_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_f32_e5m2_128x128x128_1x1x1_0_tnn_align4_stream_k_cpasync_warpspecialized_cooperative_epi_nosmem.cu.o [ 52%] Built target cutlass_library_gemm_sm90_s64x128x8tf32gemm_objs [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_u8_objs.dir/generated/gemm/90/void_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_warpspecialized_cooperative_epi_nosmem.cu.o [ 52%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_objs [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/all_sm90_void_s64x128x16gemm_f16_gemm_operations.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_i64x128x32gemm_u8_objs.dir/generated/gemm/90/void_i64x128x32gemm_u8/cutlass3x_sm90_tensorop_i64x128x32gemm_u8_u8_s32_void_s32_128x128x128_2x1x1_0_tnn_align16_stream_k_warpspecialized_cooperative_epi_nosmem.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Built target cutlass_library_gemm_sm90_void_i64x128x32gemm_u8_objs [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3/all_sm90_void_s64x128x32gemm_e4m3_gemm_operations.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_void_e5m2_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_void_e5m2_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_void_e4m3_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e4m3_f32_void_e4m3_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_objs [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f32_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_f32_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_nnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_ntn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 52%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_tnn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_pingpong_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_ttn_align8_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs.dir/generated/gemm/90/void_s64x128x16gemm_f16/cutlass3x_sm90_tensorop_s64x128x16gemm_f16_f16_f32_void_f16_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_nnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_ntn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Built target cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_objs [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3_e5m2/all_sm90_void_s64x128x32gemm_e4m3_e5m2_gemm_operations.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_void_e5m2_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_128x128x64_2x1x1_0_ttn_align8_stream_k_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_void_e5m2_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_void_e4m3_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e4m3_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e4m3_e5m2_f32_void_e4m3_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2/all_sm90_void_s64x128x32gemm_e5m2_gemm_operations.cu.o [ 53%] Built target cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2_e4m3/all_sm90_void_s64x128x32gemm_e5m2_e4m3_gemm_operations.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_void_e5m2_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_void_e5m2_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_void_e5m2_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 53%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2_objs [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/all_sm90_z1684gemm_gemm_operations.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_nnn_align1.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_void_e5m2_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_void_e4m3_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2_e4m3/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e4m3_f32_void_e4m3_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_cnn_align1.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_ncn_align1.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_void_e4m3_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_epi_tma.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_objs.dir/generated/gemm/90/void_s64x128x32gemm_e5m2/cutlass3x_sm90_tensorop_s64x128x32gemm_e5m2_e5m2_f32_void_e4m3_256x128x128_1x2x1_0_tnn_align16_warpspecialized_cooperative_fp8_fastaccum_epi_tma.cu.o [ 53%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3_objs [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_cf32_cdgrad_optimized_cf32_objs.dir/generated/conv2d/50/cf32_cdgrad_optimized_cf32/all_sm50_cf32_cdgrad_optimized_cf32_conv2d_operations.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_cf32_cdgrad_optimized_cf32_objs.dir/generated/conv2d/50/cf32_cdgrad_optimized_cf32/cutlass_simt_cf32_cdgrad_optimized_cf32_128x64_8x2_nhwc_unity_stride_align1.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_ccn_align1.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_ntn_align1.cu.o [ 53%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_objs [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_cf32_cfprop_optimized_cf32_objs.dir/generated/conv2d/50/cf32_cfprop_optimized_cf32/all_sm50_cf32_cfprop_optimized_cf32_conv2d_operations.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_cf32_cdgrad_optimized_cf32_objs.dir/generated/conv2d/50/cf32_cdgrad_optimized_cf32/cutlass_simt_cf32_cdgrad_optimized_cf32_128x64_8x2_nhwc_align1.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_ctn_align1.cu.o [ 53%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_cf32_cfprop_optimized_cf32_objs.dir/generated/conv2d/50/cf32_cfprop_optimized_cf32/cutlass_simt_cf32_cfprop_optimized_cf32_128x64_8x2_nhwc_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_nhn_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_chn_align1.cu.o [ 54%] Built target cutlass_library_conv2d_sm50_cf32_cfprop_optimized_cf32_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_cf32_cwgrad_optimized_cf32_objs.dir/generated/conv2d/50/cf32_cwgrad_optimized_cf32/all_sm50_cf32_cwgrad_optimized_cf32_conv2d_operations.cu.o [ 54%] Built target cutlass_library_conv2d_sm50_cf32_cdgrad_optimized_cf32_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_sdgrad_optimized_objs.dir/generated/conv2d/50/sdgrad_optimized/all_sm50_sdgrad_optimized_conv2d_operations.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_tnn_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_cf32_cwgrad_optimized_cf32_objs.dir/generated/conv2d/50/cf32_cwgrad_optimized_cf32/cutlass_simt_cf32_cwgrad_optimized_cf32_128x64_8x2_nhwc_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_sdgrad_optimized_objs.dir/generated/conv2d/50/sdgrad_optimized/cutlass_simt_sdgrad_optimized_128x128_8x2_nhwc_unity_stride_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_hnn_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_tcn_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_sdgrad_optimized_objs.dir/generated/conv2d/50/sdgrad_optimized/cutlass_simt_sdgrad_optimized_128x128_8x2_nhwc_align1.cu.o [ 54%] Built target cutlass_library_conv2d_sm50_cf32_cwgrad_optimized_cf32_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_sfprop_optimized_objs.dir/generated/conv2d/50/sfprop_optimized/all_sm50_sfprop_optimized_conv2d_operations.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_hcn_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_sfprop_optimized_objs.dir/generated/conv2d/50/sfprop_optimized/cutlass_simt_sfprop_optimized_128x128_8x2_nhwc_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_ttn_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_htn_align1.cu.o [ 54%] Built target cutlass_library_conv2d_sm50_sdgrad_optimized_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_swgrad_optimized_objs.dir/generated/conv2d/50/swgrad_optimized/all_sm50_swgrad_optimized_conv2d_operations.cu.o [ 54%] Built target cutlass_library_conv2d_sm50_sfprop_optimized_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm60_hfprop_optimized_objs.dir/generated/conv2d/60/hfprop_optimized/all_sm60_hfprop_optimized_conv2d_operations.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm50_swgrad_optimized_objs.dir/generated/conv2d/50/swgrad_optimized/cutlass_simt_swgrad_optimized_128x128_8x2_nhwc_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_thn_align1.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm60_hfprop_optimized_objs.dir/generated/conv2d/60/hfprop_optimized/cutlass_simt_hfprop_optimized_64x32x9_1x8x8x32_3_filter3x3_nhwc_depthwise_align8.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_z1684gemm_objs.dir/generated/gemm/90/z1684gemm/cutlass_sm90_tensorop_z1684gemm_cf64_cf64_cf64_cf64_cf64_128x64x8_1x1x1_3_hhn_align1.cu.o [ 54%] Built target cutlass_library_conv2d_sm50_swgrad_optimized_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_f16_s884dgrad_optimized_f16_objs.dir/generated/conv2d/70/f16_s884dgrad_optimized_f16/all_sm70_f16_s884dgrad_optimized_f16_conv2d_operations.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_f16_s884dgrad_optimized_f16_objs.dir/generated/conv2d/70/f16_s884dgrad_optimized_f16/cutlass_tensorop_f16_s884dgrad_optimized_f16_256x128_32x2_nhwc_unity_stride_align8.cu.o [ 54%] Built target cutlass_library_conv2d_sm60_hfprop_optimized_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_f16_s884fprop_optimized_f16_objs.dir/generated/conv2d/70/f16_s884fprop_optimized_f16/all_sm70_f16_s884fprop_optimized_f16_conv2d_operations.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_f16_s884dgrad_optimized_f16_objs.dir/generated/conv2d/70/f16_s884dgrad_optimized_f16/cutlass_tensorop_f16_s884dgrad_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 54%] Built target cutlass_library_gemm_sm90_z1684gemm_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_f16_s884wgrad_optimized_f16_objs.dir/generated/conv2d/70/f16_s884wgrad_optimized_f16/all_sm70_f16_s884wgrad_optimized_f16_conv2d_operations.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_f16_s884fprop_optimized_f16_objs.dir/generated/conv2d/70/f16_s884fprop_optimized_f16/cutlass_tensorop_f16_s884fprop_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_f16_s884wgrad_optimized_f16_objs.dir/generated/conv2d/70/f16_s884wgrad_optimized_f16/cutlass_tensorop_f16_s884wgrad_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_h884dgrad_optimized_objs.dir/generated/conv2d/70/h884dgrad_optimized/all_sm70_h884dgrad_optimized_conv2d_operations.cu.o [ 54%] Built target cutlass_library_conv2d_sm70_f16_s884dgrad_optimized_f16_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_h884fprop_optimized_objs.dir/generated/conv2d/70/h884fprop_optimized/all_sm70_h884fprop_optimized_conv2d_operations.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_h884dgrad_optimized_objs.dir/generated/conv2d/70/h884dgrad_optimized/cutlass_tensorop_h884dgrad_optimized_256x128_32x2_nhwc_unity_stride_align8.cu.o [ 54%] Built target cutlass_library_conv2d_sm70_f16_s884fprop_optimized_f16_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_h884wgrad_optimized_objs.dir/generated/conv2d/70/h884wgrad_optimized/all_sm70_h884wgrad_optimized_conv2d_operations.cu.o [ 54%] Built target cutlass_library_conv2d_sm70_f16_s884wgrad_optimized_f16_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_s884dgrad_optimized_f16_objs.dir/generated/conv2d/70/s884dgrad_optimized_f16/all_sm70_s884dgrad_optimized_f16_conv2d_operations.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_h884fprop_optimized_objs.dir/generated/conv2d/70/h884fprop_optimized/cutlass_tensorop_h884fprop_optimized_256x128_32x2_nhwc_align8.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_h884wgrad_optimized_objs.dir/generated/conv2d/70/h884wgrad_optimized/cutlass_tensorop_h884wgrad_optimized_256x128_32x2_nhwc_align8.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_s884dgrad_optimized_f16_objs.dir/generated/conv2d/70/s884dgrad_optimized_f16/cutlass_tensorop_s884dgrad_optimized_f16_256x128_32x2_nhwc_unity_stride_align8.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_h884dgrad_optimized_objs.dir/generated/conv2d/70/h884dgrad_optimized/cutlass_tensorop_h884dgrad_optimized_256x128_32x2_nhwc_align8.cu.o [ 54%] Built target cutlass_library_conv2d_sm70_h884fprop_optimized_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_s884fprop_optimized_f16_objs.dir/generated/conv2d/70/s884fprop_optimized_f16/all_sm70_s884fprop_optimized_f16_conv2d_operations.cu.o [ 54%] Built target cutlass_library_conv2d_sm70_h884wgrad_optimized_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_s884wgrad_optimized_f16_objs.dir/generated/conv2d/70/s884wgrad_optimized_f16/all_sm70_s884wgrad_optimized_f16_conv2d_operations.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_s884dgrad_optimized_f16_objs.dir/generated/conv2d/70/s884dgrad_optimized_f16/cutlass_tensorop_s884dgrad_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_s884fprop_optimized_f16_objs.dir/generated/conv2d/70/s884fprop_optimized_f16/cutlass_tensorop_s884fprop_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm70_s884wgrad_optimized_f16_objs.dir/generated/conv2d/70/s884wgrad_optimized_f16/cutlass_tensorop_s884wgrad_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 54%] Built target cutlass_library_conv2d_sm70_h884dgrad_optimized_objs [ 54%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_cf32_cdgrad_optimized_cf32_objs.dir/generated/conv2d/75/cf32_cdgrad_optimized_cf32/all_sm75_cf32_cdgrad_optimized_cf32_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_cf32_cdgrad_optimized_cf32_objs.dir/generated/conv2d/75/cf32_cdgrad_optimized_cf32/cutlass_simt_cf32_cdgrad_optimized_cf32_128x128_8x5_nhwc_unity_stride_align1.cu.o [ 55%] Built target cutlass_library_conv2d_sm70_s884fprop_optimized_f16_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_cf32_cwgrad_optimized_cf32_objs.dir/generated/conv2d/75/cf32_cwgrad_optimized_cf32/all_sm75_cf32_cwgrad_optimized_cf32_conv2d_operations.cu.o [ 55%] Built target cutlass_library_conv2d_sm70_s884dgrad_optimized_f16_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688dgrad_optimized_f16_objs.dir/generated/conv2d/75/f16_s1688dgrad_optimized_f16/all_sm75_f16_s1688dgrad_optimized_f16_conv2d_operations.cu.o [ 55%] Built target cutlass_library_conv2d_sm70_s884wgrad_optimized_f16_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688fprop_few_channels_f16_objs.dir/generated/conv2d/75/f16_s1688fprop_few_channels_f16/all_sm75_f16_s1688fprop_few_channels_f16_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_cf32_cwgrad_optimized_cf32_objs.dir/generated/conv2d/75/cf32_cwgrad_optimized_cf32/cutlass_simt_cf32_cwgrad_optimized_cf32_128x128_8x5_nhwc_align1.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688dgrad_optimized_f16_objs.dir/generated/conv2d/75/f16_s1688dgrad_optimized_f16/cutlass_tensorop_f16_s1688dgrad_optimized_f16_256x128_32x2_nhwc_unity_stride_align8.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688fprop_few_channels_f16_objs.dir/generated/conv2d/75/f16_s1688fprop_few_channels_f16/cutlass_tensorop_f16_s1688fprop_few_channels_f16_128x64_32x2_nhwc_align1.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_cf32_cdgrad_optimized_cf32_objs.dir/generated/conv2d/75/cf32_cdgrad_optimized_cf32/cutlass_simt_cf32_cdgrad_optimized_cf32_128x128_8x5_nhwc_align1.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688dgrad_optimized_f16_objs.dir/generated/conv2d/75/f16_s1688dgrad_optimized_f16/cutlass_tensorop_f16_s1688dgrad_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_f16_s1688fprop_few_channels_f16_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688fprop_fixed_channels_f16_objs.dir/generated/conv2d/75/f16_s1688fprop_fixed_channels_f16/all_sm75_f16_s1688fprop_fixed_channels_f16_conv2d_operations.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_cf32_cwgrad_optimized_cf32_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688fprop_optimized_f16_objs.dir/generated/conv2d/75/f16_s1688fprop_optimized_f16/all_sm75_f16_s1688fprop_optimized_f16_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688fprop_fixed_channels_f16_objs.dir/generated/conv2d/75/f16_s1688fprop_fixed_channels_f16/cutlass_tensorop_f16_s1688fprop_fixed_channels_f16_128x64_32x2_nhwc_align4.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688fprop_optimized_f16_objs.dir/generated/conv2d/75/f16_s1688fprop_optimized_f16/cutlass_tensorop_f16_s1688fprop_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_f16_s1688dgrad_optimized_f16_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688wgrad_optimized_f16_objs.dir/generated/conv2d/75/f16_s1688wgrad_optimized_f16/all_sm75_f16_s1688wgrad_optimized_f16_conv2d_operations.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_cf32_cdgrad_optimized_cf32_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688dgrad_optimized_objs.dir/generated/conv2d/75/h1688dgrad_optimized/all_sm75_h1688dgrad_optimized_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_f16_s1688wgrad_optimized_f16_objs.dir/generated/conv2d/75/f16_s1688wgrad_optimized_f16/cutlass_tensorop_f16_s1688wgrad_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_f16_s1688fprop_fixed_channels_f16_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688fprop_few_channels_objs.dir/generated/conv2d/75/h1688fprop_few_channels/all_sm75_h1688fprop_few_channels_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688dgrad_optimized_objs.dir/generated/conv2d/75/h1688dgrad_optimized/cutlass_tensorop_h1688dgrad_optimized_256x128_32x2_nhwc_unity_stride_align8.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_f16_s1688fprop_optimized_f16_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688fprop_fixed_channels_objs.dir/generated/conv2d/75/h1688fprop_fixed_channels/all_sm75_h1688fprop_fixed_channels_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688fprop_few_channels_objs.dir/generated/conv2d/75/h1688fprop_few_channels/cutlass_tensorop_h1688fprop_few_channels_128x64_32x2_nhwc_align1.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688fprop_fixed_channels_objs.dir/generated/conv2d/75/h1688fprop_fixed_channels/cutlass_tensorop_h1688fprop_fixed_channels_128x64_32x2_nhwc_align4.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_f16_s1688wgrad_optimized_f16_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688fprop_optimized_objs.dir/generated/conv2d/75/h1688fprop_optimized/all_sm75_h1688fprop_optimized_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688dgrad_optimized_objs.dir/generated/conv2d/75/h1688dgrad_optimized/cutlass_tensorop_h1688dgrad_optimized_256x128_32x2_nhwc_align8.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688fprop_optimized_objs.dir/generated/conv2d/75/h1688fprop_optimized/cutlass_tensorop_h1688fprop_optimized_256x128_32x2_nhwc_align8.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_h1688fprop_fixed_channels_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688wgrad_optimized_objs.dir/generated/conv2d/75/h1688wgrad_optimized/all_sm75_h1688wgrad_optimized_conv2d_operations.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_h1688fprop_few_channels_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_i8816fprop_optimized_s8_objs.dir/generated/conv2d/75/i8816fprop_optimized_s8/all_sm75_i8816fprop_optimized_s8_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_h1688wgrad_optimized_objs.dir/generated/conv2d/75/h1688wgrad_optimized/cutlass_tensorop_h1688wgrad_optimized_256x128_32x2_nhwc_align8.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_i8816fprop_optimized_s8_objs.dir/generated/conv2d/75/i8816fprop_optimized_s8/cutlass_tensorop_i8816fprop_optimized_s8_256x128_64x2_nhwc_align16.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_h1688dgrad_optimized_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_i8816fprop_optimized_u8_objs.dir/generated/conv2d/75/i8816fprop_optimized_u8/all_sm75_i8816fprop_optimized_u8_conv2d_operations.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_h1688fprop_optimized_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_i8832fprop_optimized_s4_objs.dir/generated/conv2d/75/i8832fprop_optimized_s4/all_sm75_i8832fprop_optimized_s4_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_i8816fprop_optimized_u8_objs.dir/generated/conv2d/75/i8816fprop_optimized_u8/cutlass_tensorop_i8816fprop_optimized_u8_256x128_64x2_nhwc_align16.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_i8832fprop_optimized_s4_objs.dir/generated/conv2d/75/i8832fprop_optimized_s4/cutlass_tensorop_i8832fprop_optimized_s4_256x128_128x2_nhwc_align32.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_h1688wgrad_optimized_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_i8832fprop_optimized_u4_objs.dir/generated/conv2d/75/i8832fprop_optimized_u4/all_sm75_i8832fprop_optimized_u4_conv2d_operations.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_i8816fprop_optimized_s8_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688dgrad_optimized_f16_objs.dir/generated/conv2d/75/s1688dgrad_optimized_f16/all_sm75_s1688dgrad_optimized_f16_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_i8832fprop_optimized_u4_objs.dir/generated/conv2d/75/i8832fprop_optimized_u4/cutlass_tensorop_i8832fprop_optimized_u4_256x128_128x2_nhwc_align32.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688dgrad_optimized_f16_objs.dir/generated/conv2d/75/s1688dgrad_optimized_f16/cutlass_tensorop_s1688dgrad_optimized_f16_256x128_32x2_nhwc_unity_stride_align8.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_i8816fprop_optimized_u8_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688fprop_few_channels_f16_objs.dir/generated/conv2d/75/s1688fprop_few_channels_f16/all_sm75_s1688fprop_few_channels_f16_conv2d_operations.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_i8832fprop_optimized_s4_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688fprop_fixed_channels_f16_objs.dir/generated/conv2d/75/s1688fprop_fixed_channels_f16/all_sm75_s1688fprop_fixed_channels_f16_conv2d_operations.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688fprop_few_channels_f16_objs.dir/generated/conv2d/75/s1688fprop_few_channels_f16/cutlass_tensorop_s1688fprop_few_channels_f16_128x64_32x2_nhwc_align1.cu.o [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688fprop_fixed_channels_f16_objs.dir/generated/conv2d/75/s1688fprop_fixed_channels_f16/cutlass_tensorop_s1688fprop_fixed_channels_f16_128x64_32x2_nhwc_align4.cu.o [ 55%] Built target cutlass_library_conv2d_sm75_i8832fprop_optimized_u4_objs [ 55%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688fprop_optimized_f16_objs.dir/generated/conv2d/75/s1688fprop_optimized_f16/all_sm75_s1688fprop_optimized_f16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688dgrad_optimized_f16_objs.dir/generated/conv2d/75/s1688dgrad_optimized_f16/cutlass_tensorop_s1688dgrad_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688fprop_optimized_f16_objs.dir/generated/conv2d/75/s1688fprop_optimized_f16/cutlass_tensorop_s1688fprop_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_s1688fprop_fixed_channels_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688wgrad_optimized_f16_objs.dir/generated/conv2d/75/s1688wgrad_optimized_f16/all_sm75_s1688wgrad_optimized_f16_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_s1688fprop_few_channels_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s4_i8832fprop_optimized_s4_objs.dir/generated/conv2d/75/s4_i8832fprop_optimized_s4/all_sm75_s4_i8832fprop_optimized_s4_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s1688wgrad_optimized_f16_objs.dir/generated/conv2d/75/s1688wgrad_optimized_f16/cutlass_tensorop_s1688wgrad_optimized_f16_256x128_32x2_nhwc_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_s1688dgrad_optimized_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s8_i8816fprop_few_channels_s8_objs.dir/generated/conv2d/75/s8_i8816fprop_few_channels_s8/all_sm75_s8_i8816fprop_few_channels_s8_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s4_i8832fprop_optimized_s4_objs.dir/generated/conv2d/75/s4_i8832fprop_optimized_s4/cutlass_tensorop_s4_i8832fprop_optimized_s4_256x128_128x2_nhwc_align32.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_s1688fprop_optimized_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s8_i8816fprop_fixed_channels_s8_objs.dir/generated/conv2d/75/s8_i8816fprop_fixed_channels_s8/all_sm75_s8_i8816fprop_fixed_channels_s8_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s8_i8816fprop_few_channels_s8_objs.dir/generated/conv2d/75/s8_i8816fprop_few_channels_s8/cutlass_tensorop_s8_i8816fprop_few_channels_s8_256x128_64x2_nhwc_align16.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s8_i8816fprop_fixed_channels_s8_objs.dir/generated/conv2d/75/s8_i8816fprop_fixed_channels_s8/cutlass_tensorop_s8_i8816fprop_fixed_channels_s8_256x128_64x2_nhwc_align16.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_s1688wgrad_optimized_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s8_i8816fprop_optimized_s8_objs.dir/generated/conv2d/75/s8_i8816fprop_optimized_s8/all_sm75_s8_i8816fprop_optimized_s8_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s4_i8832fprop_optimized_s4_objs.dir/generated/conv2d/75/s4_i8832fprop_optimized_s4/cutlass_tensorop_s4_i8832fprop_optimized_s4_256x128_128x2_nc64hw64_align32.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s8_i8816fprop_optimized_s8_objs.dir/generated/conv2d/75/s8_i8816fprop_optimized_s8/cutlass_tensorop_s8_i8816fprop_optimized_s8_256x128_64x2_nhwc_align16.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_s8_i8816fprop_few_channels_s8_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u4_i8832fprop_optimized_u4_objs.dir/generated/conv2d/75/u4_i8832fprop_optimized_u4/all_sm75_u4_i8832fprop_optimized_u4_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_s8_i8816fprop_fixed_channels_s8_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u8_i8816fprop_few_channels_u8_objs.dir/generated/conv2d/75/u8_i8816fprop_few_channels_u8/all_sm75_u8_i8816fprop_few_channels_u8_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u4_i8832fprop_optimized_u4_objs.dir/generated/conv2d/75/u4_i8832fprop_optimized_u4/cutlass_tensorop_u4_i8832fprop_optimized_u4_256x128_128x2_nhwc_align32.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u8_i8816fprop_few_channels_u8_objs.dir/generated/conv2d/75/u8_i8816fprop_few_channels_u8/cutlass_tensorop_u8_i8816fprop_few_channels_u8_256x128_64x2_nhwc_align16.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_s4_i8832fprop_optimized_s4_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u8_i8816fprop_fixed_channels_u8_objs.dir/generated/conv2d/75/u8_i8816fprop_fixed_channels_u8/all_sm75_u8_i8816fprop_fixed_channels_u8_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_s8_i8816fprop_optimized_s8_objs.dir/generated/conv2d/75/s8_i8816fprop_optimized_s8/cutlass_tensorop_s8_i8816fprop_optimized_s8_256x128_64x2_nc32hw32_align16.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u8_i8816fprop_fixed_channels_u8_objs.dir/generated/conv2d/75/u8_i8816fprop_fixed_channels_u8/cutlass_tensorop_u8_i8816fprop_fixed_channels_u8_256x128_64x2_nhwc_align16.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u4_i8832fprop_optimized_u4_objs.dir/generated/conv2d/75/u4_i8832fprop_optimized_u4/cutlass_tensorop_u4_i8832fprop_optimized_u4_256x128_128x2_nc64hw64_align32.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_u8_i8816fprop_few_channels_u8_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u8_i8816fprop_optimized_u8_objs.dir/generated/conv2d/75/u8_i8816fprop_optimized_u8/all_sm75_u8_i8816fprop_optimized_u8_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_s8_i8816fprop_optimized_s8_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816dgrad_optimized_bf16_objs.dir/generated/conv2d/80/bf16_s16816dgrad_optimized_bf16/all_sm80_bf16_s16816dgrad_optimized_bf16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u8_i8816fprop_optimized_u8_objs.dir/generated/conv2d/75/u8_i8816fprop_optimized_u8/cutlass_tensorop_u8_i8816fprop_optimized_u8_256x128_64x2_nhwc_align16.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_u8_i8816fprop_fixed_channels_u8_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16_objs.dir/generated/conv2d/80/bf16_s16816fprop_fixed_channels_bf16/all_sm80_bf16_s16816fprop_fixed_channels_bf16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816dgrad_optimized_bf16_objs.dir/generated/conv2d/80/bf16_s16816dgrad_optimized_bf16/cutlass_tensorop_bf16_s16816dgrad_optimized_bf16_256x128_32x3_nhwc_unity_stride_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16_objs.dir/generated/conv2d/80/bf16_s16816fprop_fixed_channels_bf16/cutlass_tensorop_bf16_s16816fprop_fixed_channels_bf16_256x128_32x3_nhwc_align4.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_u4_i8832fprop_optimized_u4_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816fprop_optimized_bf16_objs.dir/generated/conv2d/80/bf16_s16816fprop_optimized_bf16/all_sm80_bf16_s16816fprop_optimized_bf16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816fprop_optimized_bf16_objs.dir/generated/conv2d/80/bf16_s16816fprop_optimized_bf16/cutlass_tensorop_bf16_s16816fprop_optimized_bf16_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm75_u8_i8816fprop_optimized_u8_objs.dir/generated/conv2d/75/u8_i8816fprop_optimized_u8/cutlass_tensorop_u8_i8816fprop_optimized_u8_256x128_64x2_nc32hw32_align16.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816dgrad_optimized_bf16_objs.dir/generated/conv2d/80/bf16_s16816dgrad_optimized_bf16/cutlass_tensorop_bf16_s16816dgrad_optimized_bf16_256x128_32x3_nhwc_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816wgrad_optimized_bf16_objs.dir/generated/conv2d/80/bf16_s16816wgrad_optimized_bf16/all_sm80_bf16_s16816wgrad_optimized_bf16_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm75_u8_i8816fprop_optimized_u8_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816dgrad_optimized_f16_objs.dir/generated/conv2d/80/f16_s16816dgrad_optimized_f16/all_sm80_f16_s16816dgrad_optimized_f16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816fprop_optimized_bf16_objs.dir/generated/conv2d/80/bf16_s16816fprop_optimized_bf16/cutlass_tensorop_bf16_s16816fprop_optimized_bf16_256x128_32x3_nhwc_single_group_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_bf16_s16816wgrad_optimized_bf16_objs.dir/generated/conv2d/80/bf16_s16816wgrad_optimized_bf16/cutlass_tensorop_bf16_s16816wgrad_optimized_bf16_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816dgrad_optimized_f16_objs.dir/generated/conv2d/80/f16_s16816dgrad_optimized_f16/cutlass_tensorop_f16_s16816dgrad_optimized_f16_256x128_32x3_nhwc_unity_stride_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_bf16_s16816dgrad_optimized_bf16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816fprop_fixed_channels_f16_objs.dir/generated/conv2d/80/f16_s16816fprop_fixed_channels_f16/all_sm80_f16_s16816fprop_fixed_channels_f16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816fprop_fixed_channels_f16_objs.dir/generated/conv2d/80/f16_s16816fprop_fixed_channels_f16/cutlass_tensorop_f16_s16816fprop_fixed_channels_f16_256x128_32x3_nhwc_align4.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_bf16_s16816fprop_optimized_bf16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816fprop_optimized_f16_objs.dir/generated/conv2d/80/f16_s16816fprop_optimized_f16/all_sm80_f16_s16816fprop_optimized_f16_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_bf16_s16816wgrad_optimized_bf16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816wgrad_optimized_f16_objs.dir/generated/conv2d/80/f16_s16816wgrad_optimized_f16/all_sm80_f16_s16816wgrad_optimized_f16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816dgrad_optimized_f16_objs.dir/generated/conv2d/80/f16_s16816dgrad_optimized_f16/cutlass_tensorop_f16_s16816dgrad_optimized_f16_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816fprop_optimized_f16_objs.dir/generated/conv2d/80/f16_s16816fprop_optimized_f16/cutlass_tensorop_f16_s16816fprop_optimized_f16_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816wgrad_optimized_f16_objs.dir/generated/conv2d/80/f16_s16816wgrad_optimized_f16/cutlass_tensorop_f16_s16816wgrad_optimized_f16_256x128_32x3_nhwc_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_f16_s16816fprop_fixed_channels_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816dgrad_optimized_objs.dir/generated/conv2d/80/h16816dgrad_optimized/all_sm80_h16816dgrad_optimized_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816dgrad_optimized_objs.dir/generated/conv2d/80/h16816dgrad_optimized/cutlass_tensorop_h16816dgrad_optimized_256x128_32x3_nhwc_unity_stride_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_f16_s16816dgrad_optimized_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816fprop_fixed_channels_objs.dir/generated/conv2d/80/h16816fprop_fixed_channels/all_sm80_h16816fprop_fixed_channels_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_f16_s16816fprop_optimized_f16_objs.dir/generated/conv2d/80/f16_s16816fprop_optimized_f16/cutlass_tensorop_f16_s16816fprop_optimized_f16_256x128_32x3_nhwc_single_group_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_f16_s16816wgrad_optimized_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816fprop_optimized_objs.dir/generated/conv2d/80/h16816fprop_optimized/all_sm80_h16816fprop_optimized_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816fprop_fixed_channels_objs.dir/generated/conv2d/80/h16816fprop_fixed_channels/cutlass_tensorop_h16816fprop_fixed_channels_256x128_32x3_nhwc_align4.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816fprop_optimized_objs.dir/generated/conv2d/80/h16816fprop_optimized/cutlass_tensorop_h16816fprop_optimized_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816dgrad_optimized_objs.dir/generated/conv2d/80/h16816dgrad_optimized/cutlass_tensorop_h16816dgrad_optimized_256x128_32x3_nhwc_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_f16_s16816fprop_optimized_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816wgrad_optimized_objs.dir/generated/conv2d/80/h16816wgrad_optimized/all_sm80_h16816wgrad_optimized_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816wgrad_optimized_objs.dir/generated/conv2d/80/h16816wgrad_optimized/cutlass_tensorop_h16816wgrad_optimized_256x128_32x3_nhwc_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_h16816fprop_fixed_channels_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16832fprop_optimized_s8_objs.dir/generated/conv2d/80/i16832fprop_optimized_s8/all_sm80_i16832fprop_optimized_s8_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_h16816fprop_optimized_objs.dir/generated/conv2d/80/h16816fprop_optimized/cutlass_tensorop_h16816fprop_optimized_256x128_32x3_nhwc_single_group_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16832fprop_optimized_s8_objs.dir/generated/conv2d/80/i16832fprop_optimized_s8/cutlass_tensorop_i16832fprop_optimized_s8_256x128_64x3_nhwc_align16.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_h16816dgrad_optimized_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16832fprop_optimized_u8_objs.dir/generated/conv2d/80/i16832fprop_optimized_u8/all_sm80_i16832fprop_optimized_u8_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_h16816wgrad_optimized_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16864fprop_optimized_s4_objs.dir/generated/conv2d/80/i16864fprop_optimized_s4/all_sm80_i16864fprop_optimized_s4_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16832fprop_optimized_u8_objs.dir/generated/conv2d/80/i16832fprop_optimized_u8/cutlass_tensorop_i16832fprop_optimized_u8_256x128_64x3_nhwc_align16.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_h16816fprop_optimized_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16864fprop_optimized_s4_objs.dir/generated/conv2d/80/i16864fprop_optimized_s4/cutlass_tensorop_i16864fprop_optimized_s4_256x128_128x3_nhwc_align32.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16864fprop_optimized_s4_objs.dir/generated/conv2d/80/i16864fprop_optimized_s4/cutlass_tensorop_i16864fprop_optimized_s4_256x128_128x3_nhwc_single_group_align32.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16832fprop_optimized_s8_objs.dir/generated/conv2d/80/i16832fprop_optimized_s8/cutlass_tensorop_i16832fprop_optimized_s8_256x128_64x3_nhwc_single_group_align16.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16832fprop_optimized_u8_objs.dir/generated/conv2d/80/i16832fprop_optimized_u8/cutlass_tensorop_i16832fprop_optimized_u8_256x128_64x3_nhwc_single_group_align16.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16864fprop_optimized_u4_objs.dir/generated/conv2d/80/i16864fprop_optimized_u4/all_sm80_i16864fprop_optimized_u4_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_i16864fprop_optimized_s4_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816dgrad_optimized_bf16_objs.dir/generated/conv2d/80/s16816dgrad_optimized_bf16/all_sm80_s16816dgrad_optimized_bf16_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_i16832fprop_optimized_s8_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816dgrad_optimized_f16_objs.dir/generated/conv2d/80/s16816dgrad_optimized_f16/all_sm80_s16816dgrad_optimized_f16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16864fprop_optimized_u4_objs.dir/generated/conv2d/80/i16864fprop_optimized_u4/cutlass_tensorop_i16864fprop_optimized_u4_256x128_128x3_nhwc_align32.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816dgrad_optimized_bf16_objs.dir/generated/conv2d/80/s16816dgrad_optimized_bf16/cutlass_tensorop_s16816dgrad_optimized_bf16_256x128_32x3_nhwc_unity_stride_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816dgrad_optimized_f16_objs.dir/generated/conv2d/80/s16816dgrad_optimized_f16/cutlass_tensorop_s16816dgrad_optimized_f16_256x128_32x3_nhwc_unity_stride_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_i16832fprop_optimized_u8_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_bf16_objs.dir/generated/conv2d/80/s16816fprop_fixed_channels_bf16/all_sm80_s16816fprop_fixed_channels_bf16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_bf16_objs.dir/generated/conv2d/80/s16816fprop_fixed_channels_bf16/cutlass_tensorop_s16816fprop_fixed_channels_bf16_256x128_32x3_nhwc_align4.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_i16864fprop_optimized_u4_objs.dir/generated/conv2d/80/i16864fprop_optimized_u4/cutlass_tensorop_i16864fprop_optimized_u4_256x128_128x3_nhwc_single_group_align32.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816dgrad_optimized_bf16_objs.dir/generated/conv2d/80/s16816dgrad_optimized_bf16/cutlass_tensorop_s16816dgrad_optimized_bf16_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816dgrad_optimized_f16_objs.dir/generated/conv2d/80/s16816dgrad_optimized_f16/cutlass_tensorop_s16816dgrad_optimized_f16_256x128_32x3_nhwc_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_bf16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_f16_objs.dir/generated/conv2d/80/s16816fprop_fixed_channels_f16/all_sm80_s16816fprop_fixed_channels_f16_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_i16864fprop_optimized_u4_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_optimized_bf16_objs.dir/generated/conv2d/80/s16816fprop_optimized_bf16/all_sm80_s16816fprop_optimized_bf16_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s16816dgrad_optimized_bf16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_optimized_f16_objs.dir/generated/conv2d/80/s16816fprop_optimized_f16/all_sm80_s16816fprop_optimized_f16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_f16_objs.dir/generated/conv2d/80/s16816fprop_fixed_channels_f16/cutlass_tensorop_s16816fprop_fixed_channels_f16_256x128_32x3_nhwc_align4.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s16816dgrad_optimized_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816wgrad_optimized_bf16_objs.dir/generated/conv2d/80/s16816wgrad_optimized_bf16/all_sm80_s16816wgrad_optimized_bf16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_optimized_bf16_objs.dir/generated/conv2d/80/s16816fprop_optimized_bf16/cutlass_tensorop_s16816fprop_optimized_bf16_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_optimized_f16_objs.dir/generated/conv2d/80/s16816fprop_optimized_f16/cutlass_tensorop_s16816fprop_optimized_f16_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816wgrad_optimized_bf16_objs.dir/generated/conv2d/80/s16816wgrad_optimized_bf16/cutlass_tensorop_s16816wgrad_optimized_bf16_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_optimized_bf16_objs.dir/generated/conv2d/80/s16816fprop_optimized_bf16/cutlass_tensorop_s16816fprop_optimized_bf16_256x128_32x3_nhwc_single_group_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816wgrad_optimized_f16_objs.dir/generated/conv2d/80/s16816wgrad_optimized_f16/all_sm80_s16816wgrad_optimized_f16_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816fprop_optimized_f16_objs.dir/generated/conv2d/80/s16816fprop_optimized_f16/cutlass_tensorop_s16816fprop_optimized_f16_256x128_32x3_nhwc_single_group_align8.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s16816wgrad_optimized_bf16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688bf16dgrad_optimized_objs.dir/generated/conv2d/80/s1688bf16dgrad_optimized/all_sm80_s1688bf16dgrad_optimized_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s16816wgrad_optimized_f16_objs.dir/generated/conv2d/80/s16816wgrad_optimized_f16/cutlass_tensorop_s16816wgrad_optimized_f16_256x128_32x3_nhwc_align8.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688bf16dgrad_optimized_objs.dir/generated/conv2d/80/s1688bf16dgrad_optimized/cutlass_tensorop_s1688bf16dgrad_optimized_256x128_16x3_nhwc_unity_stride_align4.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s16816fprop_optimized_bf16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688bf16fprop_optimized_objs.dir/generated/conv2d/80/s1688bf16fprop_optimized/all_sm80_s1688bf16fprop_optimized_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s16816fprop_optimized_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688bf16wgrad_optimized_objs.dir/generated/conv2d/80/s1688bf16wgrad_optimized/all_sm80_s1688bf16wgrad_optimized_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688bf16fprop_optimized_objs.dir/generated/conv2d/80/s1688bf16fprop_optimized/cutlass_tensorop_s1688bf16fprop_optimized_256x128_16x3_nhwc_align4.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s16816wgrad_optimized_f16_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688bf16wgrad_optimized_objs.dir/generated/conv2d/80/s1688bf16wgrad_optimized/cutlass_tensorop_s1688bf16wgrad_optimized_256x128_16x3_nhwc_align4.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688dgrad_optimized_objs.dir/generated/conv2d/80/s1688dgrad_optimized/all_sm80_s1688dgrad_optimized_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688bf16dgrad_optimized_objs.dir/generated/conv2d/80/s1688bf16dgrad_optimized/cutlass_tensorop_s1688bf16dgrad_optimized_256x128_16x3_nhwc_align4.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688dgrad_optimized_objs.dir/generated/conv2d/80/s1688dgrad_optimized/cutlass_tensorop_s1688dgrad_optimized_128x128_16x4_nhwc_unity_stride_align4.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688bf16fprop_optimized_objs.dir/generated/conv2d/80/s1688bf16fprop_optimized/cutlass_tensorop_s1688bf16fprop_optimized_256x128_16x3_nhwc_single_group_align4.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s1688bf16wgrad_optimized_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688dgrad_optimized_tf32_objs.dir/generated/conv2d/80/s1688dgrad_optimized_tf32/all_sm80_s1688dgrad_optimized_tf32_conv2d_operations.cu.o [ 56%] Built target cutlass_library_conv2d_sm80_s1688bf16dgrad_optimized_objs [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688f16dgrad_optimized_objs.dir/generated/conv2d/80/s1688f16dgrad_optimized/all_sm80_s1688f16dgrad_optimized_conv2d_operations.cu.o [ 56%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688dgrad_optimized_tf32_objs.dir/generated/conv2d/80/s1688dgrad_optimized_tf32/cutlass_tensorop_s1688dgrad_optimized_tf32_256x128_16x3_nhwc_unity_stride_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688dgrad_optimized_objs.dir/generated/conv2d/80/s1688dgrad_optimized/cutlass_tensorop_s1688dgrad_optimized_128x128_16x4_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688f16dgrad_optimized_objs.dir/generated/conv2d/80/s1688f16dgrad_optimized/cutlass_tensorop_s1688f16dgrad_optimized_256x128_16x3_nhwc_unity_stride_align4.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688bf16fprop_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688f16fprop_optimized_objs.dir/generated/conv2d/80/s1688f16fprop_optimized/all_sm80_s1688f16fprop_optimized_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688f16fprop_optimized_objs.dir/generated/conv2d/80/s1688f16fprop_optimized/cutlass_tensorop_s1688f16fprop_optimized_256x128_16x3_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688dgrad_optimized_tf32_objs.dir/generated/conv2d/80/s1688dgrad_optimized_tf32/cutlass_tensorop_s1688dgrad_optimized_tf32_256x128_16x3_nhwc_align4.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688dgrad_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688f16wgrad_optimized_objs.dir/generated/conv2d/80/s1688f16wgrad_optimized/all_sm80_s1688f16wgrad_optimized_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688f16dgrad_optimized_objs.dir/generated/conv2d/80/s1688f16dgrad_optimized/cutlass_tensorop_s1688f16dgrad_optimized_256x128_16x3_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688f16wgrad_optimized_objs.dir/generated/conv2d/80/s1688f16wgrad_optimized/cutlass_tensorop_s1688f16wgrad_optimized_256x128_16x3_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688f16fprop_optimized_objs.dir/generated/conv2d/80/s1688f16fprop_optimized/cutlass_tensorop_s1688f16fprop_optimized_256x128_16x3_nhwc_single_group_align4.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688dgrad_optimized_tf32_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688fprop_optimized_objs.dir/generated/conv2d/80/s1688fprop_optimized/all_sm80_s1688fprop_optimized_conv2d_operations.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688f16wgrad_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688fprop_optimized_tf32_objs.dir/generated/conv2d/80/s1688fprop_optimized_tf32/all_sm80_s1688fprop_optimized_tf32_conv2d_operations.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688f16dgrad_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688tf32dgrad_optimized_objs.dir/generated/conv2d/80/s1688tf32dgrad_optimized/all_sm80_s1688tf32dgrad_optimized_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688fprop_optimized_objs.dir/generated/conv2d/80/s1688fprop_optimized/cutlass_tensorop_s1688fprop_optimized_128x128_16x4_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688fprop_optimized_tf32_objs.dir/generated/conv2d/80/s1688fprop_optimized_tf32/cutlass_tensorop_s1688fprop_optimized_tf32_256x128_16x3_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688tf32dgrad_optimized_objs.dir/generated/conv2d/80/s1688tf32dgrad_optimized/cutlass_tensorop_s1688tf32dgrad_optimized_256x128_16x3_nhwc_unity_stride_align4.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688f16fprop_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688tf32fprop_optimized_objs.dir/generated/conv2d/80/s1688tf32fprop_optimized/all_sm80_s1688tf32fprop_optimized_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688tf32fprop_optimized_objs.dir/generated/conv2d/80/s1688tf32fprop_optimized/cutlass_tensorop_s1688tf32fprop_optimized_256x128_16x3_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688fprop_optimized_objs.dir/generated/conv2d/80/s1688fprop_optimized/cutlass_tensorop_s1688fprop_optimized_128x128_16x4_nhwc_single_group_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688fprop_optimized_tf32_objs.dir/generated/conv2d/80/s1688fprop_optimized_tf32/cutlass_tensorop_s1688fprop_optimized_tf32_256x128_16x3_nhwc_single_group_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688tf32dgrad_optimized_objs.dir/generated/conv2d/80/s1688tf32dgrad_optimized/cutlass_tensorop_s1688tf32dgrad_optimized_256x128_16x3_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688tf32fprop_optimized_objs.dir/generated/conv2d/80/s1688tf32fprop_optimized/cutlass_tensorop_s1688tf32fprop_optimized_256x128_16x3_nhwc_single_group_align4.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688fprop_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688tf32wgrad_optimized_objs.dir/generated/conv2d/80/s1688tf32wgrad_optimized/all_sm80_s1688tf32wgrad_optimized_conv2d_operations.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688fprop_optimized_tf32_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688wgrad_optimized_objs.dir/generated/conv2d/80/s1688wgrad_optimized/all_sm80_s1688wgrad_optimized_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688tf32wgrad_optimized_objs.dir/generated/conv2d/80/s1688tf32wgrad_optimized/cutlass_tensorop_s1688tf32wgrad_optimized_256x128_16x3_nhwc_align4.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688tf32dgrad_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688wgrad_optimized_tf32_objs.dir/generated/conv2d/80/s1688wgrad_optimized_tf32/all_sm80_s1688wgrad_optimized_tf32_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688wgrad_optimized_objs.dir/generated/conv2d/80/s1688wgrad_optimized/cutlass_tensorop_s1688wgrad_optimized_128x128_16x4_nhwc_align4.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688tf32fprop_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s4_i16864fprop_optimized_s4_objs.dir/generated/conv2d/80/s4_i16864fprop_optimized_s4/all_sm80_s4_i16864fprop_optimized_s4_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s1688wgrad_optimized_tf32_objs.dir/generated/conv2d/80/s1688wgrad_optimized_tf32/cutlass_tensorop_s1688wgrad_optimized_tf32_256x128_16x3_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s4_i16864fprop_optimized_s4_objs.dir/generated/conv2d/80/s4_i16864fprop_optimized_s4/cutlass_tensorop_s4_i16864fprop_optimized_s4_256x128_128x3_nhwc_align32.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688tf32wgrad_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s8_i16832fprop_few_channels_s8_objs.dir/generated/conv2d/80/s8_i16832fprop_few_channels_s8/all_sm80_s8_i16832fprop_few_channels_s8_conv2d_operations.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688wgrad_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s8_i16832fprop_fixed_channels_s8_objs.dir/generated/conv2d/80/s8_i16832fprop_fixed_channels_s8/all_sm80_s8_i16832fprop_fixed_channels_s8_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s8_i16832fprop_few_channels_s8_objs.dir/generated/conv2d/80/s8_i16832fprop_few_channels_s8/cutlass_tensorop_s8_i16832fprop_few_channels_s8_256x128_64x3_nhwc_align16.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s1688wgrad_optimized_tf32_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s8_i16832fprop_optimized_s8_objs.dir/generated/conv2d/80/s8_i16832fprop_optimized_s8/all_sm80_s8_i16832fprop_optimized_s8_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s8_i16832fprop_fixed_channels_s8_objs.dir/generated/conv2d/80/s8_i16832fprop_fixed_channels_s8/cutlass_tensorop_s8_i16832fprop_fixed_channels_s8_256x128_64x3_nhwc_align16.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s4_i16864fprop_optimized_s4_objs.dir/generated/conv2d/80/s4_i16864fprop_optimized_s4/cutlass_tensorop_s4_i16864fprop_optimized_s4_256x128_128x3_nhwc_single_group_align32.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s8_i16832fprop_optimized_s8_objs.dir/generated/conv2d/80/s8_i16832fprop_optimized_s8/cutlass_tensorop_s8_i16832fprop_optimized_s8_256x128_64x3_nhwc_align16.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s8_i16832fprop_few_channels_s8_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_sdgrad_optimized_objs.dir/generated/conv2d/80/sdgrad_optimized/all_sm80_sdgrad_optimized_conv2d_operations.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s8_i16832fprop_fixed_channels_s8_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_sfprop_optimized_objs.dir/generated/conv2d/80/sfprop_optimized/all_sm80_sfprop_optimized_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_sdgrad_optimized_objs.dir/generated/conv2d/80/sdgrad_optimized/cutlass_simt_sdgrad_optimized_256x128_8x5_nhwc_unity_stride_align1.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s4_i16864fprop_optimized_s4_objs.dir/generated/conv2d/80/s4_i16864fprop_optimized_s4/cutlass_tensorop_s4_i16864fprop_optimized_s4_256x128_128x3_nc64hw64_align32.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s8_i16832fprop_optimized_s8_objs.dir/generated/conv2d/80/s8_i16832fprop_optimized_s8/cutlass_tensorop_s8_i16832fprop_optimized_s8_256x128_64x3_nhwc_single_group_align16.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_sfprop_optimized_objs.dir/generated/conv2d/80/sfprop_optimized/cutlass_simt_sfprop_optimized_256x128_8x5_nhwc_align1.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s4_i16864fprop_optimized_s4_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_swgrad_optimized_objs.dir/generated/conv2d/80/swgrad_optimized/all_sm80_swgrad_optimized_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_s8_i16832fprop_optimized_s8_objs.dir/generated/conv2d/80/s8_i16832fprop_optimized_s8/cutlass_tensorop_s8_i16832fprop_optimized_s8_256x128_64x3_nc32hw32_align16.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_swgrad_optimized_objs.dir/generated/conv2d/80/swgrad_optimized/cutlass_simt_swgrad_optimized_256x128_8x5_nhwc_align1.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_sdgrad_optimized_objs.dir/generated/conv2d/80/sdgrad_optimized/cutlass_simt_sdgrad_optimized_256x128_8x5_nhwc_align1.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_sfprop_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_tf32_s1688dgrad_optimized_tf32_objs.dir/generated/conv2d/80/tf32_s1688dgrad_optimized_tf32/all_sm80_tf32_s1688dgrad_optimized_tf32_conv2d_operations.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_s8_i16832fprop_optimized_s8_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_tf32_s1688fprop_optimized_tf32_objs.dir/generated/conv2d/80/tf32_s1688fprop_optimized_tf32/all_sm80_tf32_s1688fprop_optimized_tf32_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_tf32_s1688dgrad_optimized_tf32_objs.dir/generated/conv2d/80/tf32_s1688dgrad_optimized_tf32/cutlass_tensorop_tf32_s1688dgrad_optimized_tf32_256x128_16x3_nhwc_unity_stride_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_tf32_s1688fprop_optimized_tf32_objs.dir/generated/conv2d/80/tf32_s1688fprop_optimized_tf32/cutlass_tensorop_tf32_s1688fprop_optimized_tf32_256x128_16x3_nhwc_align4.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_swgrad_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_tf32_s1688wgrad_optimized_tf32_objs.dir/generated/conv2d/80/tf32_s1688wgrad_optimized_tf32/all_sm80_tf32_s1688wgrad_optimized_tf32_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_tf32_s1688dgrad_optimized_tf32_objs.dir/generated/conv2d/80/tf32_s1688dgrad_optimized_tf32/cutlass_tensorop_tf32_s1688dgrad_optimized_tf32_256x128_16x3_nhwc_align4.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_sdgrad_optimized_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u4_i16864fprop_optimized_u4_objs.dir/generated/conv2d/80/u4_i16864fprop_optimized_u4/all_sm80_u4_i16864fprop_optimized_u4_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_tf32_s1688wgrad_optimized_tf32_objs.dir/generated/conv2d/80/tf32_s1688wgrad_optimized_tf32/cutlass_tensorop_tf32_s1688wgrad_optimized_tf32_256x128_16x3_nhwc_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_tf32_s1688fprop_optimized_tf32_objs.dir/generated/conv2d/80/tf32_s1688fprop_optimized_tf32/cutlass_tensorop_tf32_s1688fprop_optimized_tf32_256x128_16x3_nhwc_single_group_align4.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u4_i16864fprop_optimized_u4_objs.dir/generated/conv2d/80/u4_i16864fprop_optimized_u4/cutlass_tensorop_u4_i16864fprop_optimized_u4_256x128_128x3_nhwc_align32.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_tf32_s1688wgrad_optimized_tf32_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u8_i16832fprop_few_channels_u8_objs.dir/generated/conv2d/80/u8_i16832fprop_few_channels_u8/all_sm80_u8_i16832fprop_few_channels_u8_conv2d_operations.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_tf32_s1688dgrad_optimized_tf32_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u8_i16832fprop_fixed_channels_u8_objs.dir/generated/conv2d/80/u8_i16832fprop_fixed_channels_u8/all_sm80_u8_i16832fprop_fixed_channels_u8_conv2d_operations.cu.o [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u4_i16864fprop_optimized_u4_objs.dir/generated/conv2d/80/u4_i16864fprop_optimized_u4/cutlass_tensorop_u4_i16864fprop_optimized_u4_256x128_128x3_nhwc_single_group_align32.cu.o [ 57%] Built target cutlass_library_conv2d_sm80_tf32_s1688fprop_optimized_tf32_objs [ 57%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u8_i16832fprop_fixed_channels_u8_objs.dir/generated/conv2d/80/u8_i16832fprop_fixed_channels_u8/cutlass_tensorop_u8_i16832fprop_fixed_channels_u8_256x128_64x3_nhwc_align16.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u8_i16832fprop_few_channels_u8_objs.dir/generated/conv2d/80/u8_i16832fprop_few_channels_u8/cutlass_tensorop_u8_i16832fprop_few_channels_u8_256x128_64x3_nhwc_align16.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u8_i16832fprop_optimized_u8_objs.dir/generated/conv2d/80/u8_i16832fprop_optimized_u8/all_sm80_u8_i16832fprop_optimized_u8_conv2d_operations.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u8_i16832fprop_optimized_u8_objs.dir/generated/conv2d/80/u8_i16832fprop_optimized_u8/cutlass_tensorop_u8_i16832fprop_optimized_u8_256x128_64x3_nhwc_align16.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u4_i16864fprop_optimized_u4_objs.dir/generated/conv2d/80/u4_i16864fprop_optimized_u4/cutlass_tensorop_u4_i16864fprop_optimized_u4_256x128_128x3_nc64hw64_align32.cu.o [ 58%] Built target cutlass_library_conv2d_sm80_u8_i16832fprop_fixed_channels_u8_objs [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e4m3_objs.dir/generated/conv2d/89/s16832fprop_fixed_channels_e4m3/all_sm89_s16832fprop_fixed_channels_e4m3_conv2d_operations.cu.o [ 58%] Built target cutlass_library_conv2d_sm80_u8_i16832fprop_few_channels_u8_objs [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e5m2_objs.dir/generated/conv2d/89/s16832fprop_fixed_channels_e5m2/all_sm89_s16832fprop_fixed_channels_e5m2_conv2d_operations.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e4m3_objs.dir/generated/conv2d/89/s16832fprop_fixed_channels_e4m3/cutlass_tensorop_s16832fprop_fixed_channels_e4m3_256x128_64x3_nhwc_align16.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u8_i16832fprop_optimized_u8_objs.dir/generated/conv2d/80/u8_i16832fprop_optimized_u8/cutlass_tensorop_u8_i16832fprop_optimized_u8_256x128_64x3_nhwc_single_group_align16.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e5m2_objs.dir/generated/conv2d/89/s16832fprop_fixed_channels_e5m2/cutlass_tensorop_s16832fprop_fixed_channels_e5m2_256x128_64x3_nhwc_align16.cu.o [ 58%] Built target cutlass_library_conv2d_sm80_u4_i16864fprop_optimized_u4_objs [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_optimized_e4m3_objs.dir/generated/conv2d/89/s16832fprop_optimized_e4m3/all_sm89_s16832fprop_optimized_e4m3_conv2d_operations.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_optimized_e4m3_objs.dir/generated/conv2d/89/s16832fprop_optimized_e4m3/cutlass_tensorop_s16832fprop_optimized_e4m3_256x128_64x3_nhwc_align16.cu.o [ 58%] Built target cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e4m3_objs [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_optimized_e5m2_objs.dir/generated/conv2d/89/s16832fprop_optimized_e5m2/all_sm89_s16832fprop_optimized_e5m2_conv2d_operations.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm80_u8_i16832fprop_optimized_u8_objs.dir/generated/conv2d/80/u8_i16832fprop_optimized_u8/cutlass_tensorop_u8_i16832fprop_optimized_u8_256x128_64x3_nc32hw32_align16.cu.o [ 58%] Built target cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e5m2_objs [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16_objs.dir/generated/conv2d/90/h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16/all_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16_conv2d_operations.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_optimized_e5m2_objs.dir/generated/conv2d/89/s16832fprop_optimized_e5m2/cutlass_tensorop_s16832fprop_optimized_e5m2_256x128_64x3_nhwc_align16.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16_objs.dir/generated/conv2d/90/h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16/cutlass3x_sm90_tensorop_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 58%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_optimized_e4m3_objs.dir/generated/conv2d/89/s16832fprop_optimized_e4m3/cutlass_tensorop_s16832fprop_optimized_e4m3_256x128_64x3_nhwc_single_group_align16.cu.o [ 58%] Built target cutlass_library_conv2d_sm80_u8_i16832fprop_optimized_u8_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16_objs.dir/generated/conv2d/90/h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16/all_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16_conv2d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm89_s16832fprop_optimized_e5m2_objs.dir/generated/conv2d/89/s16832fprop_optimized_e5m2/cutlass_tensorop_s16832fprop_optimized_e5m2_256x128_64x3_nhwc_single_group_align16.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16_objs.dir/generated/conv2d/90/h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16/cutlass3x_sm90_tensorop_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv2d_sm89_s16832fprop_optimized_e4m3_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32/all_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32_conv2d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv2d_sm89_s16832fprop_optimized_e5m2_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32/all_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32_conv2d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16_objs.dir/generated/conv2d/90/h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16/cutlass3x_sm90_tensorop_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32/all_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32_conv2d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16_objs.dir/generated/conv2d/90/h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16/cutlass3x_sm90_tensorop_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32/all_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32_conv2d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16_objs.dir/generated/conv3d/80/bf16_s16816dgrad3d_analytic_bf16/all_sm80_bf16_s16816dgrad3d_analytic_bf16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16_objs.dir/generated/conv3d/80/bf16_s16816dgrad3d_analytic_bf16/cutlass_tensorop_bf16_s16816dgrad3d_analytic_bf16_256x128_32x3.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16_objs.dir/generated/conv3d/80/bf16_s16816dgrad3d_optimized_bf16/all_sm80_bf16_s16816dgrad3d_optimized_bf16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16_objs.dir/generated/conv3d/80/bf16_s16816dgrad3d_optimized_bf16/cutlass_tensorop_bf16_s16816dgrad3d_optimized_bf16_256x128_32x3_unity_stride.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16_objs.dir/generated/conv3d/80/bf16_s16816fprop3d_optimized_bf16/all_sm80_bf16_s16816fprop3d_optimized_bf16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32_objs.dir/generated/conv2d/90/s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16_objs.dir/generated/conv3d/80/bf16_s16816fprop3d_optimized_bf16/cutlass_tensorop_bf16_s16816fprop3d_optimized_bf16_256x128_32x3.cu.o [ 59%] Built target cutlass_library_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16_objs.dir/generated/conv3d/80/bf16_s16816wgrad3d_optimized_bf16/all_sm80_bf16_s16816wgrad3d_optimized_bf16_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_f16_s16816dgrad3d_analytic_f16_objs.dir/generated/conv3d/80/f16_s16816dgrad3d_analytic_f16/all_sm80_f16_s16816dgrad3d_analytic_f16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16_objs.dir/generated/conv3d/80/bf16_s16816wgrad3d_optimized_bf16/cutlass_tensorop_bf16_s16816wgrad3d_optimized_bf16_256x128_32x3.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_f16_s16816dgrad3d_analytic_f16_objs.dir/generated/conv3d/80/f16_s16816dgrad3d_analytic_f16/cutlass_tensorop_f16_s16816dgrad3d_analytic_f16_256x128_32x3.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_f16_s16816dgrad3d_optimized_f16_objs.dir/generated/conv3d/80/f16_s16816dgrad3d_optimized_f16/all_sm80_f16_s16816dgrad3d_optimized_f16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_f16_s16816dgrad3d_optimized_f16_objs.dir/generated/conv3d/80/f16_s16816dgrad3d_optimized_f16/cutlass_tensorop_f16_s16816dgrad3d_optimized_f16_256x128_32x3_unity_stride.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_f16_s16816fprop3d_optimized_f16_objs.dir/generated/conv3d/80/f16_s16816fprop3d_optimized_f16/all_sm80_f16_s16816fprop3d_optimized_f16_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_f16_s16816dgrad3d_analytic_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_f16_s16816wgrad3d_optimized_f16_objs.dir/generated/conv3d/80/f16_s16816wgrad3d_optimized_f16/all_sm80_f16_s16816wgrad3d_optimized_f16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_f16_s16816fprop3d_optimized_f16_objs.dir/generated/conv3d/80/f16_s16816fprop3d_optimized_f16/cutlass_tensorop_f16_s16816fprop3d_optimized_f16_256x128_32x3.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_f16_s16816wgrad3d_optimized_f16_objs.dir/generated/conv3d/80/f16_s16816wgrad3d_optimized_f16/cutlass_tensorop_f16_s16816wgrad3d_optimized_f16_256x128_32x3.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_f16_s16816dgrad3d_optimized_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_h16816dgrad3d_analytic_objs.dir/generated/conv3d/80/h16816dgrad3d_analytic/all_sm80_h16816dgrad3d_analytic_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_h16816dgrad3d_analytic_objs.dir/generated/conv3d/80/h16816dgrad3d_analytic/cutlass_tensorop_h16816dgrad3d_analytic_256x128_32x3.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_f16_s16816fprop3d_optimized_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_h16816dgrad3d_optimized_objs.dir/generated/conv3d/80/h16816dgrad3d_optimized/all_sm80_h16816dgrad3d_optimized_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_f16_s16816wgrad3d_optimized_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_h16816fprop3d_optimized_objs.dir/generated/conv3d/80/h16816fprop3d_optimized/all_sm80_h16816fprop3d_optimized_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_h16816wgrad3d_optimized_objs.dir/generated/conv3d/80/h16816wgrad3d_optimized/all_sm80_h16816wgrad3d_optimized_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_h16816dgrad3d_optimized_objs.dir/generated/conv3d/80/h16816dgrad3d_optimized/cutlass_tensorop_h16816dgrad3d_optimized_256x128_32x3_unity_stride.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_h16816fprop3d_optimized_objs.dir/generated/conv3d/80/h16816fprop3d_optimized/cutlass_tensorop_h16816fprop3d_optimized_256x128_32x3.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_h16816dgrad3d_analytic_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_bf16_objs.dir/generated/conv3d/80/s16816dgrad3d_analytic_bf16/all_sm80_s16816dgrad3d_analytic_bf16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_h16816wgrad3d_optimized_objs.dir/generated/conv3d/80/h16816wgrad3d_optimized/cutlass_tensorop_h16816wgrad3d_optimized_256x128_32x3.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_bf16_objs.dir/generated/conv3d/80/s16816dgrad3d_analytic_bf16/cutlass_tensorop_s16816dgrad3d_analytic_bf16_256x128_32x3.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_h16816dgrad3d_optimized_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_f16_objs.dir/generated/conv3d/80/s16816dgrad3d_analytic_f16/all_sm80_s16816dgrad3d_analytic_f16_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_h16816fprop3d_optimized_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_bf16_objs.dir/generated/conv3d/80/s16816dgrad3d_optimized_bf16/all_sm80_s16816dgrad3d_optimized_bf16_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_h16816wgrad3d_optimized_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_f16_objs.dir/generated/conv3d/80/s16816dgrad3d_optimized_f16/all_sm80_s16816dgrad3d_optimized_f16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_f16_objs.dir/generated/conv3d/80/s16816dgrad3d_analytic_f16/cutlass_tensorop_s16816dgrad3d_analytic_f16_256x128_32x3.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_bf16_objs.dir/generated/conv3d/80/s16816dgrad3d_optimized_bf16/cutlass_tensorop_s16816dgrad3d_optimized_bf16_256x128_32x3_unity_stride.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_bf16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816fprop3d_optimized_bf16_objs.dir/generated/conv3d/80/s16816fprop3d_optimized_bf16/all_sm80_s16816fprop3d_optimized_bf16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_f16_objs.dir/generated/conv3d/80/s16816dgrad3d_optimized_f16/cutlass_tensorop_s16816dgrad3d_optimized_f16_256x128_32x3_unity_stride.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816fprop3d_optimized_bf16_objs.dir/generated/conv3d/80/s16816fprop3d_optimized_bf16/cutlass_tensorop_s16816fprop3d_optimized_bf16_256x128_32x3.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816fprop3d_optimized_f16_objs.dir/generated/conv3d/80/s16816fprop3d_optimized_f16/all_sm80_s16816fprop3d_optimized_f16_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_bf16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_bf16_objs.dir/generated/conv3d/80/s16816wgrad3d_optimized_bf16/all_sm80_s16816wgrad3d_optimized_bf16_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_f16_objs.dir/generated/conv3d/80/s16816wgrad3d_optimized_f16/all_sm80_s16816wgrad3d_optimized_f16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816fprop3d_optimized_f16_objs.dir/generated/conv3d/80/s16816fprop3d_optimized_f16/cutlass_tensorop_s16816fprop3d_optimized_f16_256x128_32x3.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_bf16_objs.dir/generated/conv3d/80/s16816wgrad3d_optimized_bf16/cutlass_tensorop_s16816wgrad3d_optimized_bf16_256x128_32x3.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_f16_objs.dir/generated/conv3d/80/s16816wgrad3d_optimized_f16/cutlass_tensorop_s16816wgrad3d_optimized_f16_256x128_32x3.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_s16816fprop3d_optimized_bf16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16_objs.dir/generated/conv3d/90/h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16/all_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16_objs.dir/generated/conv3d/90/h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16/cutlass3x_sm90_tensorop_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_s16816fprop3d_optimized_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16_objs.dir/generated/conv3d/90/h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16/all_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_bf16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32/all_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32_conv3d_operations.cu.o [ 59%] Built target cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32/all_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16_objs.dir/generated/conv3d/90/h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16/cutlass3x_sm90_tensorop_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16_objs.dir/generated/conv3d/90/h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16/cutlass3x_sm90_tensorop_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32/all_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32/all_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32_conv3d_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32_64x64x64_2x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Built target cutlass_library_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688herk_objs.dir/generated/rank_k/80/c1688herk/all_sm80_c1688herk_rank_k_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688herk_objs.dir/generated/rank_k/80/c1688herk/cutlass_tensorop_c1688herk_128x64_16x4_n_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688herk_objs.dir/generated/rank_k/80/c1688herk/cutlass_tensorop_c1688herk_128x64_16x4_n_u_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688herk_objs.dir/generated/rank_k/80/c1688herk/cutlass_tensorop_c1688herk_128x64_16x4_h_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688herk_objs.dir/generated/rank_k/80/c1688herk/cutlass_tensorop_c1688herk_128x64_16x4_h_u_align1.cu.o [ 59%] Built target cutlass_library_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688syrk_objs.dir/generated/rank_k/80/c1688syrk/all_sm80_c1688syrk_rank_k_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688syrk_objs.dir/generated/rank_k/80/c1688syrk/cutlass_tensorop_c1688syrk_128x64_16x4_n_l_align1.cu.o [ 59%] Built target cutlass_library_rank_k_sm80_c1688herk_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32herk_objs.dir/generated/rank_k/80/c1688tf32herk/all_sm80_c1688tf32herk_rank_k_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32herk_objs.dir/generated/rank_k/80/c1688tf32herk/cutlass_tensorop_c1688tf32herk_128x64_16x4_n_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688syrk_objs.dir/generated/rank_k/80/c1688syrk/cutlass_tensorop_c1688syrk_128x64_16x4_n_u_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32_objs.dir/generated/conv3d/90/s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32/cutlass3x_sm90_tensorop_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32herk_objs.dir/generated/rank_k/80/c1688tf32herk/cutlass_tensorop_c1688tf32herk_128x64_16x4_n_u_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688syrk_objs.dir/generated/rank_k/80/c1688syrk/cutlass_tensorop_c1688syrk_128x64_16x4_t_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16_objs.dir/generated/conv3d/90/h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16/cutlass3x_sm90_tensorop_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16_64x64x64_1x1x1_0_align16_warpspecialized_epi_tma.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32herk_objs.dir/generated/rank_k/80/c1688tf32herk/cutlass_tensorop_c1688tf32herk_128x64_16x4_h_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688syrk_objs.dir/generated/rank_k/80/c1688syrk/cutlass_tensorop_c1688syrk_128x64_16x4_t_u_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32herk_objs.dir/generated/rank_k/80/c1688tf32herk/cutlass_tensorop_c1688tf32herk_128x64_16x4_h_u_align1.cu.o [ 59%] Built target cutlass_library_rank_k_sm80_c1688syrk_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32syrk_objs.dir/generated/rank_k/80/c1688tf32syrk/all_sm80_c1688tf32syrk_rank_k_operations.cu.o [ 59%] Built target cutlass_library_rank_k_sm80_c1688tf32herk_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_d884syrk_objs.dir/generated/rank_k/80/d884syrk/all_sm80_d884syrk_rank_k_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32syrk_objs.dir/generated/rank_k/80/c1688tf32syrk/cutlass_tensorop_c1688tf32syrk_128x64_16x4_n_l_align1.cu.o [ 59%] Built target cutlass_library_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884herk_objs.dir/generated/rank_k/80/gz884herk/all_sm80_gz884herk_rank_k_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_d884syrk_objs.dir/generated/rank_k/80/d884syrk/cutlass_tensorop_d884syrk_128x128_16x3_n_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884herk_objs.dir/generated/rank_k/80/gz884herk/cutlass_tensorop_gz884herk_64x64_8x3_n_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32syrk_objs.dir/generated/rank_k/80/c1688tf32syrk/cutlass_tensorop_c1688tf32syrk_128x64_16x4_n_u_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_d884syrk_objs.dir/generated/rank_k/80/d884syrk/cutlass_tensorop_d884syrk_128x128_16x3_n_u_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884herk_objs.dir/generated/rank_k/80/gz884herk/cutlass_tensorop_gz884herk_64x64_8x3_n_u_align1.cu.o [ 59%] Built target cutlass_library_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16_objs [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884syrk_objs.dir/generated/rank_k/80/gz884syrk/all_sm80_gz884syrk_rank_k_operations.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32syrk_objs.dir/generated/rank_k/80/c1688tf32syrk/cutlass_tensorop_c1688tf32syrk_128x64_16x4_t_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884syrk_objs.dir/generated/rank_k/80/gz884syrk/cutlass_tensorop_gz884syrk_64x64_8x3_n_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884herk_objs.dir/generated/rank_k/80/gz884herk/cutlass_tensorop_gz884herk_64x64_8x3_h_l_align1.cu.o [ 59%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_d884syrk_objs.dir/generated/rank_k/80/d884syrk/cutlass_tensorop_d884syrk_128x128_16x3_t_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_c1688tf32syrk_objs.dir/generated/rank_k/80/c1688tf32syrk/cutlass_tensorop_c1688tf32syrk_128x64_16x4_t_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884syrk_objs.dir/generated/rank_k/80/gz884syrk/cutlass_tensorop_gz884syrk_64x64_8x3_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884herk_objs.dir/generated/rank_k/80/gz884herk/cutlass_tensorop_gz884herk_64x64_8x3_h_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_d884syrk_objs.dir/generated/rank_k/80/d884syrk/cutlass_tensorop_d884syrk_128x128_16x3_t_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884syrk_objs.dir/generated/rank_k/80/gz884syrk/cutlass_tensorop_gz884syrk_64x64_8x3_t_l_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm80_gz884herk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688syrk_objs.dir/generated/rank_k/80/s1688syrk/all_sm80_s1688syrk_rank_k_operations.cu.o [ 60%] Built target cutlass_library_rank_k_sm80_c1688tf32syrk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688tf32syrk_objs.dir/generated/rank_k/80/s1688tf32syrk/all_sm80_s1688tf32syrk_rank_k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688syrk_objs.dir/generated/rank_k/80/s1688syrk/cutlass_tensorop_s1688syrk_256x128_16x3_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688tf32syrk_objs.dir/generated/rank_k/80/s1688tf32syrk/cutlass_tensorop_s1688tf32syrk_256x128_16x3_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_gz884syrk_objs.dir/generated/rank_k/80/gz884syrk/cutlass_tensorop_gz884syrk_64x64_8x3_t_u_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm80_d884syrk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884herk_objs.dir/generated/rank_k/80/z884herk/all_sm80_z884herk_rank_k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884herk_objs.dir/generated/rank_k/80/z884herk/cutlass_tensorop_z884herk_128x64_8x3_n_l_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm80_gz884syrk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884syrk_objs.dir/generated/rank_k/80/z884syrk/all_sm80_z884syrk_rank_k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884syrk_objs.dir/generated/rank_k/80/z884syrk/cutlass_tensorop_z884syrk_128x64_8x3_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688syrk_objs.dir/generated/rank_k/80/s1688syrk/cutlass_tensorop_s1688syrk_256x128_16x3_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688tf32syrk_objs.dir/generated/rank_k/80/s1688tf32syrk/cutlass_tensorop_s1688tf32syrk_256x128_16x3_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884herk_objs.dir/generated/rank_k/80/z884herk/cutlass_tensorop_z884herk_128x64_8x3_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884syrk_objs.dir/generated/rank_k/80/z884syrk/cutlass_tensorop_z884syrk_128x64_8x3_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884herk_objs.dir/generated/rank_k/80/z884herk/cutlass_tensorop_z884herk_128x64_8x3_h_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884syrk_objs.dir/generated/rank_k/80/z884syrk/cutlass_tensorop_z884syrk_128x64_8x3_t_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688tf32syrk_objs.dir/generated/rank_k/80/s1688tf32syrk/cutlass_tensorop_s1688tf32syrk_256x128_16x3_t_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688syrk_objs.dir/generated/rank_k/80/s1688syrk/cutlass_tensorop_s1688syrk_256x128_16x3_t_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884herk_objs.dir/generated/rank_k/80/z884herk/cutlass_tensorop_z884herk_128x64_8x3_h_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_z884syrk_objs.dir/generated/rank_k/80/z884syrk/cutlass_tensorop_z884syrk_128x64_8x3_t_u_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm80_z884herk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_d1684syrk_objs.dir/generated/rank_k/90/d1684syrk/all_sm90_d1684syrk_rank_k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688tf32syrk_objs.dir/generated/rank_k/80/s1688tf32syrk/cutlass_tensorop_s1688tf32syrk_256x128_16x3_t_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm80_s1688syrk_objs.dir/generated/rank_k/80/s1688syrk/cutlass_tensorop_s1688syrk_256x128_16x3_t_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_d1684syrk_objs.dir/generated/rank_k/90/d1684syrk/cutlass_tensorop_d1684syrk_128x128x16_1x1x1_3_n_l_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm80_z884syrk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684herk_objs.dir/generated/rank_k/90/gz1684herk/all_sm90_gz1684herk_rank_k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684herk_objs.dir/generated/rank_k/90/gz1684herk/cutlass_tensorop_gz1684herk_64x64x8_1x1x1_3_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_d1684syrk_objs.dir/generated/rank_k/90/d1684syrk/cutlass_tensorop_d1684syrk_128x128x16_1x1x1_3_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684herk_objs.dir/generated/rank_k/90/gz1684herk/cutlass_tensorop_gz1684herk_64x64x8_1x1x1_3_n_u_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm80_s1688tf32syrk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684syrk_objs.dir/generated/rank_k/90/gz1684syrk/all_sm90_gz1684syrk_rank_k_operations.cu.o [ 60%] Built target cutlass_library_rank_k_sm80_s1688syrk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684herk_objs.dir/generated/rank_k/90/z1684herk/all_sm90_z1684herk_rank_k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684syrk_objs.dir/generated/rank_k/90/gz1684syrk/cutlass_tensorop_gz1684syrk_64x64x8_1x1x1_3_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684herk_objs.dir/generated/rank_k/90/z1684herk/cutlass_tensorop_z1684herk_128x64x8_1x1x1_3_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684herk_objs.dir/generated/rank_k/90/gz1684herk/cutlass_tensorop_gz1684herk_64x64x8_1x1x1_3_h_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_d1684syrk_objs.dir/generated/rank_k/90/d1684syrk/cutlass_tensorop_d1684syrk_128x128x16_1x1x1_3_t_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684syrk_objs.dir/generated/rank_k/90/gz1684syrk/cutlass_tensorop_gz1684syrk_64x64x8_1x1x1_3_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684herk_objs.dir/generated/rank_k/90/gz1684herk/cutlass_tensorop_gz1684herk_64x64x8_1x1x1_3_h_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684herk_objs.dir/generated/rank_k/90/z1684herk/cutlass_tensorop_z1684herk_128x64x8_1x1x1_3_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_d1684syrk_objs.dir/generated/rank_k/90/d1684syrk/cutlass_tensorop_d1684syrk_128x128x16_1x1x1_3_t_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684syrk_objs.dir/generated/rank_k/90/gz1684syrk/cutlass_tensorop_gz1684syrk_64x64x8_1x1x1_3_t_l_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm90_gz1684herk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684syrk_objs.dir/generated/rank_k/90/z1684syrk/all_sm90_z1684syrk_rank_k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684herk_objs.dir/generated/rank_k/90/z1684herk/cutlass_tensorop_z1684herk_128x64x8_1x1x1_3_h_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684syrk_objs.dir/generated/rank_k/90/z1684syrk/cutlass_tensorop_z1684syrk_128x64x8_1x1x1_3_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_gz1684syrk_objs.dir/generated/rank_k/90/gz1684syrk/cutlass_tensorop_gz1684syrk_64x64x8_1x1x1_3_t_u_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm90_d1684syrk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688her2k_objs.dir/generated/rank_2k/80/c1688her2k/all_sm80_c1688her2k_rank_2k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688her2k_objs.dir/generated/rank_2k/80/c1688her2k/cutlass_tensorop_c1688her2k_128x64_16x4_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684herk_objs.dir/generated/rank_k/90/z1684herk/cutlass_tensorop_z1684herk_128x64x8_1x1x1_3_h_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684syrk_objs.dir/generated/rank_k/90/z1684syrk/cutlass_tensorop_z1684syrk_128x64x8_1x1x1_3_n_u_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm90_gz1684syrk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688syr2k_objs.dir/generated/rank_2k/80/c1688syr2k/all_sm80_c1688syr2k_rank_2k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688syr2k_objs.dir/generated/rank_2k/80/c1688syr2k/cutlass_tensorop_c1688syr2k_128x64_16x4_n_l_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm90_z1684herk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32her2k_objs.dir/generated/rank_2k/80/c1688tf32her2k/all_sm80_c1688tf32her2k_rank_2k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684syrk_objs.dir/generated/rank_k/90/z1684syrk/cutlass_tensorop_z1684syrk_128x64x8_1x1x1_3_t_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32her2k_objs.dir/generated/rank_2k/80/c1688tf32her2k/cutlass_tensorop_c1688tf32her2k_128x64_16x4_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688her2k_objs.dir/generated/rank_2k/80/c1688her2k/cutlass_tensorop_c1688her2k_128x64_16x4_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_k_sm90_z1684syrk_objs.dir/generated/rank_k/90/z1684syrk/cutlass_tensorop_z1684syrk_128x64x8_1x1x1_3_t_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688syr2k_objs.dir/generated/rank_2k/80/c1688syr2k/cutlass_tensorop_c1688syr2k_128x64_16x4_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32her2k_objs.dir/generated/rank_2k/80/c1688tf32her2k/cutlass_tensorop_c1688tf32her2k_128x64_16x4_n_u_align1.cu.o [ 60%] Built target cutlass_library_rank_k_sm90_z1684syrk_objs [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32syr2k_objs.dir/generated/rank_2k/80/c1688tf32syr2k/all_sm80_c1688tf32syr2k_rank_2k_operations.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32syr2k_objs.dir/generated/rank_2k/80/c1688tf32syr2k/cutlass_tensorop_c1688tf32syr2k_128x64_16x4_n_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688her2k_objs.dir/generated/rank_2k/80/c1688her2k/cutlass_tensorop_c1688her2k_128x64_16x4_h_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688syr2k_objs.dir/generated/rank_2k/80/c1688syr2k/cutlass_tensorop_c1688syr2k_128x64_16x4_t_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32her2k_objs.dir/generated/rank_2k/80/c1688tf32her2k/cutlass_tensorop_c1688tf32her2k_128x64_16x4_h_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32syr2k_objs.dir/generated/rank_2k/80/c1688tf32syr2k/cutlass_tensorop_c1688tf32syr2k_128x64_16x4_n_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688syr2k_objs.dir/generated/rank_2k/80/c1688syr2k/cutlass_tensorop_c1688syr2k_128x64_16x4_t_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688her2k_objs.dir/generated/rank_2k/80/c1688her2k/cutlass_tensorop_c1688her2k_128x64_16x4_h_u_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32syr2k_objs.dir/generated/rank_2k/80/c1688tf32syr2k/cutlass_tensorop_c1688tf32syr2k_128x64_16x4_t_l_align1.cu.o [ 60%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32her2k_objs.dir/generated/rank_2k/80/c1688tf32her2k/cutlass_tensorop_c1688tf32her2k_128x64_16x4_h_u_align1.cu.o [ 60%] Built target cutlass_library_rank_2k_sm80_c1688syr2k_objs [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_d884syr2k_objs.dir/generated/rank_2k/80/d884syr2k/all_sm80_d884syr2k_rank_2k_operations.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_c1688tf32syr2k_objs.dir/generated/rank_2k/80/c1688tf32syr2k/cutlass_tensorop_c1688tf32syr2k_128x64_16x4_t_u_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_d884syr2k_objs.dir/generated/rank_2k/80/d884syr2k/cutlass_tensorop_d884syr2k_128x128_16x3_n_l_align1.cu.o [ 61%] Built target cutlass_library_rank_2k_sm80_c1688tf32her2k_objs [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884her2k_objs.dir/generated/rank_2k/80/gz884her2k/all_sm80_gz884her2k_rank_2k_operations.cu.o [ 61%] Built target cutlass_library_rank_2k_sm80_c1688her2k_objs [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884syr2k_objs.dir/generated/rank_2k/80/gz884syr2k/all_sm80_gz884syr2k_rank_2k_operations.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884her2k_objs.dir/generated/rank_2k/80/gz884her2k/cutlass_tensorop_gz884her2k_64x64_8x3_n_l_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884syr2k_objs.dir/generated/rank_2k/80/gz884syr2k/cutlass_tensorop_gz884syr2k_64x64_8x3_n_l_align1.cu.o [ 61%] Built target cutlass_library_rank_2k_sm80_c1688tf32syr2k_objs [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688syr2k_objs.dir/generated/rank_2k/80/s1688syr2k/all_sm80_s1688syr2k_rank_2k_operations.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_d884syr2k_objs.dir/generated/rank_2k/80/d884syr2k/cutlass_tensorop_d884syr2k_128x128_16x3_n_u_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688syr2k_objs.dir/generated/rank_2k/80/s1688syr2k/cutlass_tensorop_s1688syr2k_256x128_16x3_n_l_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884syr2k_objs.dir/generated/rank_2k/80/gz884syr2k/cutlass_tensorop_gz884syr2k_64x64_8x3_n_u_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884her2k_objs.dir/generated/rank_2k/80/gz884her2k/cutlass_tensorop_gz884her2k_64x64_8x3_n_u_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884syr2k_objs.dir/generated/rank_2k/80/gz884syr2k/cutlass_tensorop_gz884syr2k_64x64_8x3_t_l_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884her2k_objs.dir/generated/rank_2k/80/gz884her2k/cutlass_tensorop_gz884her2k_64x64_8x3_h_l_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_d884syr2k_objs.dir/generated/rank_2k/80/d884syr2k/cutlass_tensorop_d884syr2k_128x128_16x3_t_l_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884syr2k_objs.dir/generated/rank_2k/80/gz884syr2k/cutlass_tensorop_gz884syr2k_64x64_8x3_t_u_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688syr2k_objs.dir/generated/rank_2k/80/s1688syr2k/cutlass_tensorop_s1688syr2k_256x128_16x3_n_u_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_gz884her2k_objs.dir/generated/rank_2k/80/gz884her2k/cutlass_tensorop_gz884her2k_64x64_8x3_h_u_align1.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_d884syr2k_objs.dir/generated/rank_2k/80/d884syr2k/cutlass_tensorop_d884syr2k_128x128_16x3_t_u_align1.cu.o [ 61%] Built target cutlass_library_rank_2k_sm80_gz884syr2k_objs [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688tf32syr2k_objs.dir/generated/rank_2k/80/s1688tf32syr2k/all_sm80_s1688tf32syr2k_rank_2k_operations.cu.o [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688tf32syr2k_objs.dir/generated/rank_2k/80/s1688tf32syr2k/cutlass_tensorop_s1688tf32syr2k_256x128_16x3_n_l_align1.cu.o [ 61%] Built target cutlass_library_rank_2k_sm80_gz884her2k_objs [ 61%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884her2k_objs.dir/generated/rank_2k/80/z884her2k/all_sm80_z884her2k_rank_2k_operations.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884her2k_objs.dir/generated/rank_2k/80/z884her2k/cutlass_tensorop_z884her2k_128x64_8x3_n_l_align1.cu.o [ 62%] Built target cutlass_library_rank_2k_sm80_d884syr2k_objs [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884syr2k_objs.dir/generated/rank_2k/80/z884syr2k/all_sm80_z884syr2k_rank_2k_operations.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688syr2k_objs.dir/generated/rank_2k/80/s1688syr2k/cutlass_tensorop_s1688syr2k_256x128_16x3_t_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884syr2k_objs.dir/generated/rank_2k/80/z884syr2k/cutlass_tensorop_z884syr2k_128x64_8x3_n_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884her2k_objs.dir/generated/rank_2k/80/z884her2k/cutlass_tensorop_z884her2k_128x64_8x3_n_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688tf32syr2k_objs.dir/generated/rank_2k/80/s1688tf32syr2k/cutlass_tensorop_s1688tf32syr2k_256x128_16x3_n_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884syr2k_objs.dir/generated/rank_2k/80/z884syr2k/cutlass_tensorop_z884syr2k_128x64_8x3_n_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688syr2k_objs.dir/generated/rank_2k/80/s1688syr2k/cutlass_tensorop_s1688syr2k_256x128_16x3_t_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884her2k_objs.dir/generated/rank_2k/80/z884her2k/cutlass_tensorop_z884her2k_128x64_8x3_h_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884syr2k_objs.dir/generated/rank_2k/80/z884syr2k/cutlass_tensorop_z884syr2k_128x64_8x3_t_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688tf32syr2k_objs.dir/generated/rank_2k/80/s1688tf32syr2k/cutlass_tensorop_s1688tf32syr2k_256x128_16x3_t_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884syr2k_objs.dir/generated/rank_2k/80/z884syr2k/cutlass_tensorop_z884syr2k_128x64_8x3_t_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_z884her2k_objs.dir/generated/rank_2k/80/z884her2k/cutlass_tensorop_z884her2k_128x64_8x3_h_u_align1.cu.o [ 62%] Built target cutlass_library_rank_2k_sm80_s1688syr2k_objs [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_d1684syr2k_objs.dir/generated/rank_2k/90/d1684syr2k/all_sm90_d1684syr2k_rank_2k_operations.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_d1684syr2k_objs.dir/generated/rank_2k/90/d1684syr2k/cutlass_tensorop_d1684syr2k_128x128x16_1x1x1_3_n_l_align1.cu.o [ 62%] Built target cutlass_library_rank_2k_sm80_z884syr2k_objs [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684her2k_objs.dir/generated/rank_2k/90/gz1684her2k/all_sm90_gz1684her2k_rank_2k_operations.cu.o [ 62%] Built target cutlass_library_rank_2k_sm80_z884her2k_objs [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684syr2k_objs.dir/generated/rank_2k/90/gz1684syr2k/all_sm90_gz1684syr2k_rank_2k_operations.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm80_s1688tf32syr2k_objs.dir/generated/rank_2k/80/s1688tf32syr2k/cutlass_tensorop_s1688tf32syr2k_256x128_16x3_t_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684her2k_objs.dir/generated/rank_2k/90/gz1684her2k/cutlass_tensorop_gz1684her2k_64x64x8_1x1x1_3_n_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684syr2k_objs.dir/generated/rank_2k/90/gz1684syr2k/cutlass_tensorop_gz1684syr2k_64x64x8_1x1x1_3_n_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_d1684syr2k_objs.dir/generated/rank_2k/90/d1684syr2k/cutlass_tensorop_d1684syr2k_128x128x16_1x1x1_3_n_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684her2k_objs.dir/generated/rank_2k/90/gz1684her2k/cutlass_tensorop_gz1684her2k_64x64x8_1x1x1_3_n_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684syr2k_objs.dir/generated/rank_2k/90/gz1684syr2k/cutlass_tensorop_gz1684syr2k_64x64x8_1x1x1_3_n_u_align1.cu.o [ 62%] Built target cutlass_library_rank_2k_sm80_s1688tf32syr2k_objs [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684her2k_objs.dir/generated/rank_2k/90/z1684her2k/all_sm90_z1684her2k_rank_2k_operations.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_d1684syr2k_objs.dir/generated/rank_2k/90/d1684syr2k/cutlass_tensorop_d1684syr2k_128x128x16_1x1x1_3_t_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684her2k_objs.dir/generated/rank_2k/90/gz1684her2k/cutlass_tensorop_gz1684her2k_64x64x8_1x1x1_3_h_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684syr2k_objs.dir/generated/rank_2k/90/gz1684syr2k/cutlass_tensorop_gz1684syr2k_64x64x8_1x1x1_3_t_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684her2k_objs.dir/generated/rank_2k/90/z1684her2k/cutlass_tensorop_z1684her2k_128x64x8_1x1x1_3_n_l_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684syr2k_objs.dir/generated/rank_2k/90/gz1684syr2k/cutlass_tensorop_gz1684syr2k_64x64x8_1x1x1_3_t_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_gz1684her2k_objs.dir/generated/rank_2k/90/gz1684her2k/cutlass_tensorop_gz1684her2k_64x64x8_1x1x1_3_h_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_d1684syr2k_objs.dir/generated/rank_2k/90/d1684syr2k/cutlass_tensorop_d1684syr2k_128x128x16_1x1x1_3_t_u_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684her2k_objs.dir/generated/rank_2k/90/z1684her2k/cutlass_tensorop_z1684her2k_128x64x8_1x1x1_3_n_u_align1.cu.o [ 62%] Built target cutlass_library_rank_2k_sm90_gz1684syr2k_objs [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684syr2k_objs.dir/generated/rank_2k/90/z1684syr2k/all_sm90_z1684syr2k_rank_2k_operations.cu.o [ 62%] Built target cutlass_library_rank_2k_sm90_gz1684her2k_objs [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/all_sm80_c1688tf32trmm_trmm_operations.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684syr2k_objs.dir/generated/rank_2k/90/z1684syr2k/cutlass_tensorop_z1684syr2k_128x64x8_1x1x1_3_n_l_align1.cu.o [ 62%] Built target cutlass_library_rank_2k_sm90_d1684syr2k_objs [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/all_sm80_c1688trmm_trmm_operations.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_nn_ls_l_nu_align1.cu.o [ 62%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_nn_ls_l_nu_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684her2k_objs.dir/generated/rank_2k/90/z1684her2k/cutlass_tensorop_z1684her2k_128x64x8_1x1x1_3_h_l_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684syr2k_objs.dir/generated/rank_2k/90/z1684syr2k/cutlass_tensorop_z1684syr2k_128x64x8_1x1x1_3_n_u_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_cn_ls_l_nu_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_cn_ls_l_nu_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_nn_ls_l_un_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684her2k_objs.dir/generated/rank_2k/90/z1684her2k/cutlass_tensorop_z1684her2k_128x64x8_1x1x1_3_h_u_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684syr2k_objs.dir/generated/rank_2k/90/z1684syr2k/cutlass_tensorop_z1684syr2k_128x64x8_1x1x1_3_t_l_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_nn_ls_l_un_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_cn_ls_l_un_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_rank_2k_sm90_z1684syr2k_objs.dir/generated/rank_2k/90/z1684syr2k/cutlass_tensorop_z1684syr2k_128x64x8_1x1x1_3_t_u_align1.cu.o [ 63%] Built target cutlass_library_rank_2k_sm90_z1684her2k_objs [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/all_sm80_d884trmm_trmm_operations.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_cn_ls_l_un_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_nn_ls_l_nu_align1.cu.o [ 63%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_nn_ls_u_nu_align1.cu.o [ 63%] Built target cutlass_library_rank_2k_sm90_z1684syr2k_objs [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/all_sm80_gz884trmm_trmm_operations.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_nn_ls_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_nn_ls_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_cn_ls_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_nn_ls_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_nn_ls_u_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_nn_ls_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_cn_ls_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_cn_ls_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_nn_ls_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_nn_ls_u_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_cn_ls_u_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_nn_ls_u_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_cn_ls_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_nn_rs_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_nn_rs_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_cn_ls_u_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_nn_ls_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_nn_rs_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_cn_rs_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_nn_rs_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_cn_ls_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_nn_rs_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_nn_rs_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_nn_ls_u_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_cn_rs_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_nn_rs_u_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_cn_rs_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_cn_ls_u_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_nn_rs_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_tn_ls_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_nn_rs_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_nn_rs_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_cn_rs_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_tn_ls_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_cn_rs_l_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_cn_rs_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_nn_rs_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_nn_rs_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_tn_ls_u_nu_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_nn_rs_u_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_cn_rs_l_un_align1.cu.o [ 64%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_tn_ls_u_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_cn_rs_u_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_cn_rs_u_nu_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_nn_rs_u_nu_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_tn_rs_l_nu_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_tn_ls_l_nu_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_nn_rs_u_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_cn_rs_u_nu_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_tn_rs_l_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_hn_ls_l_nu_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_cn_rs_u_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_nn_rs_u_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_tn_rs_u_nu_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_tn_ls_l_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_cn_rs_u_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_tn_ls_l_nu_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_d884trmm_objs.dir/generated/trmm/80/d884trmm/cutlass_tensorop_d884trmm_128x128_16x3_tn_rs_u_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_hn_ls_l_un_align1.cu.o [ 65%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_tn_ls_l_nu_align1.cu.o [ 66%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_hn_ls_l_nu_align1.cu.o [ 66%] Built target cutlass_library_trmm_sm80_d884trmm_objs [ 66%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/all_sm80_s1688tf32trmm_trmm_operations.cu.o [ 66%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_hn_ls_l_nu_align1.cu.o [ 66%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_tn_ls_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_nn_ls_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_tn_ls_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_tn_ls_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_hn_ls_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_nn_ls_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_hn_ls_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_hn_ls_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_tn_ls_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_tn_ls_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_nn_ls_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_tn_ls_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_hn_ls_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_hn_ls_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_nn_ls_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_hn_ls_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_tn_rs_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_tn_ls_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_nn_rs_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_hn_rs_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_tn_ls_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_hn_ls_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_nn_rs_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_tn_rs_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_tn_rs_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_hn_ls_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_hn_rs_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_hn_rs_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_nn_rs_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_tn_rs_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_tn_rs_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_tn_rs_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_nn_rs_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_hn_rs_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_hn_rs_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_hn_rs_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_tn_ls_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_tn_rs_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_tn_rs_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_tn_rs_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_tn_ls_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_hn_rs_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_hn_rs_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688tf32trmm_objs.dir/generated/trmm/80/c1688tf32trmm/cutlass_tensorop_c1688tf32trmm_128x64_16x4_hn_rs_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_tn_rs_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_tn_ls_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_tn_rs_u_nu_align1.cu.o [ 67%] Built target cutlass_library_trmm_sm80_c1688tf32trmm_objs [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/all_sm80_s1688trmm_trmm_operations.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_gz884trmm_objs.dir/generated/trmm/80/gz884trmm/cutlass_tensorop_gz884trmm_64x64_8x3_hn_rs_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_tn_ls_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_nn_ls_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_hn_rs_u_nu_align1.cu.o [ 67%] Built target cutlass_library_trmm_sm80_gz884trmm_objs [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/all_sm80_z884trmm_trmm_operations.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_nn_ls_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_tn_rs_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_nn_ls_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_tn_rs_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_cn_ls_l_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_tn_rs_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_nn_ls_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_c1688trmm_objs.dir/generated/trmm/80/c1688trmm/cutlass_tensorop_c1688trmm_128x64_16x4_hn_rs_u_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_nn_ls_l_un_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_tn_rs_u_nu_align1.cu.o [ 67%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_nn_ls_u_un_align1.cu.o [ 67%] Built target cutlass_library_trmm_sm80_c1688trmm_objs [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/all_sm90_d1684trmm_trmm_operations.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_cn_ls_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_nn_ls_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688tf32trmm_objs.dir/generated/trmm/80/s1688tf32trmm/cutlass_tensorop_s1688tf32trmm_256x128_16x3_tn_rs_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_nn_rs_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_nn_ls_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_nn_ls_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_cn_ls_u_nu_align1.cu.o [ 68%] Built target cutlass_library_trmm_sm80_s1688tf32trmm_objs [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/all_sm90_gz1684trmm_trmm_operations.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_nn_rs_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_nn_ls_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_nn_ls_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_nn_ls_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_nn_ls_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_cn_ls_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_nn_rs_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_cn_ls_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_nn_ls_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_nn_rs_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_nn_rs_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_nn_rs_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_cn_ls_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_nn_rs_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_cn_rs_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_tn_ls_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_nn_ls_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_nn_rs_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_nn_rs_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_tn_ls_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_cn_ls_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_nn_rs_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_cn_rs_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_nn_ls_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_tn_ls_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_tn_ls_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_nn_rs_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_cn_ls_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_tn_ls_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_tn_ls_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_cn_rs_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_nn_rs_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_tn_ls_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_tn_rs_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_cn_rs_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_nn_rs_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_nn_rs_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_tn_ls_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_cn_rs_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_tn_rs_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_cn_rs_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_tn_ls_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_tn_rs_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_tn_rs_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_nn_rs_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_hn_ls_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_tn_rs_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_cn_rs_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_s1688trmm_objs.dir/generated/trmm/80/s1688trmm/cutlass_tensorop_s1688trmm_256x128_16x3_tn_rs_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_tn_ls_l_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_tn_rs_u_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_nn_rs_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_hn_ls_l_un_align1.cu.o [ 68%] Built target cutlass_library_trmm_sm80_s1688trmm_objs [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/all_sm90_z1684trmm_trmm_operations.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_d1684trmm_objs.dir/generated/trmm/90/d1684trmm/cutlass_tensorop_d1684trmm_128x128x16_1x1x1_3_tn_rs_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_nn_ls_l_nu_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_cn_rs_u_un_align1.cu.o [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_tn_ls_u_nu_align1.cu.o [ 68%] Built target cutlass_library_trmm_sm90_d1684trmm_objs [ 68%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688hemm_objs.dir/generated/symm/80/c1688hemm/all_sm80_c1688hemm_symm_operations.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_tn_ls_l_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_cn_ls_l_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688hemm_objs.dir/generated/symm/80/c1688hemm/cutlass_tensorop_c1688hemm_128x64_16x4_n_ls_l_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_hn_ls_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_hn_ls_l_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_nn_ls_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_tn_ls_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_tn_ls_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688hemm_objs.dir/generated/symm/80/c1688hemm/cutlass_tensorop_c1688hemm_128x64_16x4_n_ls_u_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_cn_ls_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_hn_ls_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_hn_ls_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_nn_ls_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_tn_rs_l_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688hemm_objs.dir/generated/symm/80/c1688hemm/cutlass_tensorop_c1688hemm_128x64_16x4_n_rs_l_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_tn_ls_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_cn_ls_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_hn_rs_l_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_hn_ls_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_nn_ls_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_tn_rs_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688hemm_objs.dir/generated/symm/80/c1688hemm/cutlass_tensorop_c1688hemm_128x64_16x4_n_rs_u_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_tn_ls_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_cn_ls_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_hn_rs_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_hn_ls_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_nn_rs_l_nu_align1.cu.o [ 69%] Built target cutlass_library_symm_sm80_c1688hemm_objs [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688symm_objs.dir/generated/symm/80/c1688symm/all_sm80_c1688symm_symm_operations.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_tn_rs_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_tn_rs_l_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688symm_objs.dir/generated/symm/80/c1688symm/cutlass_tensorop_c1688symm_128x64_16x4_n_ls_l_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_cn_rs_l_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_hn_rs_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_hn_rs_l_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_nn_rs_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_tn_rs_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_tn_rs_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688symm_objs.dir/generated/symm/80/c1688symm/cutlass_tensorop_c1688symm_128x64_16x4_n_ls_u_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_cn_rs_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_hn_rs_l_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm80_z884trmm_objs.dir/generated/trmm/80/z884trmm/cutlass_tensorop_z884trmm_128x64_8x3_hn_rs_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_nn_rs_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688symm_objs.dir/generated/symm/80/c1688symm/cutlass_tensorop_c1688symm_128x64_16x4_n_rs_l_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_tn_rs_u_nu_align1.cu.o [ 69%] Built target cutlass_library_trmm_sm80_z884trmm_objs [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_hn_rs_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_cn_rs_u_nu_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_tn_rs_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_gz1684trmm_objs.dir/generated/trmm/90/gz1684trmm/cutlass_tensorop_gz1684trmm_64x64x8_1x1x1_3_hn_rs_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_nn_rs_u_un_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688symm_objs.dir/generated/symm/80/c1688symm/cutlass_tensorop_c1688symm_128x64_16x4_n_rs_u_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_cn_rs_u_un_align1.cu.o [ 69%] Built target cutlass_library_trmm_sm90_gz1684trmm_objs [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32hemm_objs.dir/generated/symm/80/c1688tf32hemm/all_sm80_c1688tf32hemm_symm_operations.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32hemm_objs.dir/generated/symm/80/c1688tf32hemm/cutlass_tensorop_c1688tf32hemm_128x64_16x4_n_ls_l_align1.cu.o [ 69%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_tn_ls_l_nu_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_hn_ls_l_nu_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_tn_ls_l_un_align1.cu.o [ 70%] Built target cutlass_library_symm_sm80_c1688symm_objs [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32symm_objs.dir/generated/symm/80/c1688tf32symm/all_sm80_c1688tf32symm_symm_operations.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32hemm_objs.dir/generated/symm/80/c1688tf32hemm/cutlass_tensorop_c1688tf32hemm_128x64_16x4_n_ls_u_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_hn_ls_l_un_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32symm_objs.dir/generated/symm/80/c1688tf32symm/cutlass_tensorop_c1688tf32symm_128x64_16x4_n_ls_l_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_tn_ls_u_nu_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_hn_ls_u_nu_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32hemm_objs.dir/generated/symm/80/c1688tf32hemm/cutlass_tensorop_c1688tf32hemm_128x64_16x4_n_rs_l_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32symm_objs.dir/generated/symm/80/c1688tf32symm/cutlass_tensorop_c1688tf32symm_128x64_16x4_n_ls_u_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_tn_ls_u_un_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_hn_ls_u_un_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_tn_rs_l_nu_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32symm_objs.dir/generated/symm/80/c1688tf32symm/cutlass_tensorop_c1688tf32symm_128x64_16x4_n_rs_l_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32hemm_objs.dir/generated/symm/80/c1688tf32hemm/cutlass_tensorop_c1688tf32hemm_128x64_16x4_n_rs_u_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_hn_rs_l_nu_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_tn_rs_l_un_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_hn_rs_l_un_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_c1688tf32symm_objs.dir/generated/symm/80/c1688tf32symm/cutlass_tensorop_c1688tf32symm_128x64_16x4_n_rs_u_align1.cu.o [ 70%] Built target cutlass_library_symm_sm80_c1688tf32hemm_objs [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_d884symm_objs.dir/generated/symm/80/d884symm/all_sm80_d884symm_symm_operations.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_tn_rs_u_nu_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_d884symm_objs.dir/generated/symm/80/d884symm/cutlass_tensorop_d884symm_128x128_16x3_n_ls_l_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_hn_rs_u_nu_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_tn_rs_u_un_align1.cu.o [ 70%] Built target cutlass_library_symm_sm80_c1688tf32symm_objs [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884hemm_objs.dir/generated/symm/80/gz884hemm/all_sm80_gz884hemm_symm_operations.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_trmm_sm90_z1684trmm_objs.dir/generated/trmm/90/z1684trmm/cutlass_tensorop_z1684trmm_128x64x8_1x1x1_3_hn_rs_u_un_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884hemm_objs.dir/generated/symm/80/gz884hemm/cutlass_tensorop_gz884hemm_64x64_8x3_n_ls_l_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_d884symm_objs.dir/generated/symm/80/d884symm/cutlass_tensorop_d884symm_128x128_16x3_n_ls_u_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_d884symm_objs.dir/generated/symm/80/d884symm/cutlass_tensorop_d884symm_128x128_16x3_n_rs_l_align1.cu.o [ 70%] Built target cutlass_library_trmm_sm90_z1684trmm_objs [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884symm_objs.dir/generated/symm/80/gz884symm/all_sm80_gz884symm_symm_operations.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884hemm_objs.dir/generated/symm/80/gz884hemm/cutlass_tensorop_gz884hemm_64x64_8x3_n_ls_u_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884symm_objs.dir/generated/symm/80/gz884symm/cutlass_tensorop_gz884symm_64x64_8x3_n_ls_l_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_d884symm_objs.dir/generated/symm/80/d884symm/cutlass_tensorop_d884symm_128x128_16x3_n_rs_u_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884symm_objs.dir/generated/symm/80/gz884symm/cutlass_tensorop_gz884symm_64x64_8x3_n_ls_u_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884hemm_objs.dir/generated/symm/80/gz884hemm/cutlass_tensorop_gz884hemm_64x64_8x3_n_rs_l_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884symm_objs.dir/generated/symm/80/gz884symm/cutlass_tensorop_gz884symm_64x64_8x3_n_rs_l_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884symm_objs.dir/generated/symm/80/gz884symm/cutlass_tensorop_gz884symm_64x64_8x3_n_rs_u_align1.cu.o [ 70%] Built target cutlass_library_symm_sm80_d884symm_objs [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688symm_objs.dir/generated/symm/80/s1688symm/all_sm80_s1688symm_symm_operations.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688symm_objs.dir/generated/symm/80/s1688symm/cutlass_tensorop_s1688symm_256x128_16x3_n_ls_l_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_gz884hemm_objs.dir/generated/symm/80/gz884hemm/cutlass_tensorop_gz884hemm_64x64_8x3_n_rs_u_align1.cu.o [ 70%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688symm_objs.dir/generated/symm/80/s1688symm/cutlass_tensorop_s1688symm_256x128_16x3_n_ls_u_align1.cu.o [ 70%] Built target cutlass_library_symm_sm80_gz884symm_objs [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688tf32symm_objs.dir/generated/symm/80/s1688tf32symm/all_sm80_s1688tf32symm_symm_operations.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688tf32symm_objs.dir/generated/symm/80/s1688tf32symm/cutlass_tensorop_s1688tf32symm_256x128_16x3_n_ls_l_align1.cu.o [ 71%] Built target cutlass_library_symm_sm80_gz884hemm_objs [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884hemm_objs.dir/generated/symm/80/z884hemm/all_sm80_z884hemm_symm_operations.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884hemm_objs.dir/generated/symm/80/z884hemm/cutlass_tensorop_z884hemm_128x64_8x3_n_ls_l_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688symm_objs.dir/generated/symm/80/s1688symm/cutlass_tensorop_s1688symm_256x128_16x3_n_rs_l_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688symm_objs.dir/generated/symm/80/s1688symm/cutlass_tensorop_s1688symm_256x128_16x3_n_rs_u_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688tf32symm_objs.dir/generated/symm/80/s1688tf32symm/cutlass_tensorop_s1688tf32symm_256x128_16x3_n_ls_u_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884hemm_objs.dir/generated/symm/80/z884hemm/cutlass_tensorop_z884hemm_128x64_8x3_n_ls_u_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884hemm_objs.dir/generated/symm/80/z884hemm/cutlass_tensorop_z884hemm_128x64_8x3_n_rs_l_align1.cu.o [ 71%] Built target cutlass_library_symm_sm80_s1688symm_objs [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884symm_objs.dir/generated/symm/80/z884symm/all_sm80_z884symm_symm_operations.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884hemm_objs.dir/generated/symm/80/z884hemm/cutlass_tensorop_z884hemm_128x64_8x3_n_rs_u_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884symm_objs.dir/generated/symm/80/z884symm/cutlass_tensorop_z884symm_128x64_8x3_n_ls_l_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688tf32symm_objs.dir/generated/symm/80/s1688tf32symm/cutlass_tensorop_s1688tf32symm_256x128_16x3_n_rs_l_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_s1688tf32symm_objs.dir/generated/symm/80/s1688tf32symm/cutlass_tensorop_s1688tf32symm_256x128_16x3_n_rs_u_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884symm_objs.dir/generated/symm/80/z884symm/cutlass_tensorop_z884symm_128x64_8x3_n_ls_u_align1.cu.o [ 71%] Built target cutlass_library_symm_sm80_z884hemm_objs [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_d1684symm_objs.dir/generated/symm/90/d1684symm/all_sm90_d1684symm_symm_operations.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_d1684symm_objs.dir/generated/symm/90/d1684symm/cutlass_tensorop_d1684symm_128x128x16_1x1x1_3_n_ls_l_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_d1684symm_objs.dir/generated/symm/90/d1684symm/cutlass_tensorop_d1684symm_128x128x16_1x1x1_3_n_ls_u_align1.cu.o [ 71%] Built target cutlass_library_symm_sm80_s1688tf32symm_objs [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684hemm_objs.dir/generated/symm/90/gz1684hemm/all_sm90_gz1684hemm_symm_operations.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884symm_objs.dir/generated/symm/80/z884symm/cutlass_tensorop_z884symm_128x64_8x3_n_rs_l_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684hemm_objs.dir/generated/symm/90/gz1684hemm/cutlass_tensorop_gz1684hemm_64x64x8_1x1x1_3_n_ls_l_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_d1684symm_objs.dir/generated/symm/90/d1684symm/cutlass_tensorop_d1684symm_128x128x16_1x1x1_3_n_rs_l_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_d1684symm_objs.dir/generated/symm/90/d1684symm/cutlass_tensorop_d1684symm_128x128x16_1x1x1_3_n_rs_u_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm80_z884symm_objs.dir/generated/symm/80/z884symm/cutlass_tensorop_z884symm_128x64_8x3_n_rs_u_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684hemm_objs.dir/generated/symm/90/gz1684hemm/cutlass_tensorop_gz1684hemm_64x64x8_1x1x1_3_n_ls_u_align1.cu.o [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684hemm_objs.dir/generated/symm/90/gz1684hemm/cutlass_tensorop_gz1684hemm_64x64x8_1x1x1_3_n_rs_l_align1.cu.o [ 71%] Built target cutlass_library_symm_sm90_d1684symm_objs [ 71%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684symm_objs.dir/generated/symm/90/gz1684symm/all_sm90_gz1684symm_symm_operations.cu.o [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684symm_objs.dir/generated/symm/90/gz1684symm/cutlass_tensorop_gz1684symm_64x64x8_1x1x1_3_n_ls_l_align1.cu.o [ 72%] Built target cutlass_library_symm_sm80_z884symm_objs [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684hemm_objs.dir/generated/symm/90/z1684hemm/all_sm90_z1684hemm_symm_operations.cu.o [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684hemm_objs.dir/generated/symm/90/gz1684hemm/cutlass_tensorop_gz1684hemm_64x64x8_1x1x1_3_n_rs_u_align1.cu.o [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684hemm_objs.dir/generated/symm/90/z1684hemm/cutlass_tensorop_z1684hemm_128x64x8_1x1x1_3_n_ls_l_align1.cu.o [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684hemm_objs.dir/generated/symm/90/z1684hemm/cutlass_tensorop_z1684hemm_128x64x8_1x1x1_3_n_ls_u_align1.cu.o [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684symm_objs.dir/generated/symm/90/gz1684symm/cutlass_tensorop_gz1684symm_64x64x8_1x1x1_3_n_ls_u_align1.cu.o [ 72%] Built target cutlass_library_symm_sm90_gz1684hemm_objs [ 72%] Linking CUDA static library libcutlass_symm_sm90_z1684symm.a [ 72%] Built target cutlass_library_symm_sm90_z1684symm_static [ 72%] Linking CUDA static library libcutlass_gemm_sm50_cgemm.a [ 72%] Built target cutlass_library_gemm_sm50_cgemm_static [ 72%] Linking CUDA static library libcutlass_gemm_sm50_dgemm.a [ 72%] Built target cutlass_library_gemm_sm50_dgemm_static [ 72%] Linking CUDA static library libcutlass_gemm_sm50_sgemm.a [ 72%] Built target cutlass_library_gemm_sm50_sgemm_static [ 72%] Linking CUDA static library libcutlass_gemm_sm60_hgemm.a [ 72%] Built target cutlass_library_gemm_sm60_hgemm_static [ 72%] Linking CUDA static library libcutlass_gemm_sm61_igemm_s8.a [ 72%] Built target cutlass_library_gemm_sm61_igemm_s8_static [ 72%] Linking CUDA static library libcutlass_gemm_sm61_s8_igemm_s8.a [ 72%] Built target cutlass_library_gemm_sm61_s8_igemm_s8_static [ 72%] Linking CUDA static library libcutlass_gemm_sm70_f16_s884gemm_f16.a [ 72%] Built target cutlass_library_gemm_sm70_f16_s884gemm_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm70_f16_s884gemm_planar_complex_array_f16.a [ 72%] Built target cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm70_f16_s884gemm_planar_complex_f16.a [ 72%] Built target cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm70_h884gemm.a [ 72%] Built target cutlass_library_gemm_sm70_h884gemm_static [ 72%] Linking CUDA static library libcutlass_gemm_sm70_h884gemm_planar_complex.a [ 72%] Built target cutlass_library_gemm_sm70_h884gemm_planar_complex_static [ 72%] Linking CUDA static library libcutlass_gemm_sm70_h884gemm_planar_complex_array.a [ 72%] Built target cutlass_library_gemm_sm70_h884gemm_planar_complex_array_static [ 72%] Linking CUDA static library libcutlass_gemm_sm70_s884gemm_f16.a [ 72%] Built target cutlass_library_gemm_sm70_s884gemm_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm70_s884gemm_planar_complex_array_f16.a [ 72%] Built target cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm70_s884gemm_planar_complex_f16.a [ 72%] Built target cutlass_library_gemm_sm70_s884gemm_planar_complex_f16_static [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684symm_objs.dir/generated/symm/90/gz1684symm/cutlass_tensorop_gz1684symm_64x64x8_1x1x1_3_n_rs_l_align1.cu.o [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684hemm_objs.dir/generated/symm/90/z1684hemm/cutlass_tensorop_z1684hemm_128x64x8_1x1x1_3_n_rs_l_align1.cu.o [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_z1684hemm_objs.dir/generated/symm/90/z1684hemm/cutlass_tensorop_z1684hemm_128x64x8_1x1x1_3_n_rs_u_align1.cu.o [ 72%] Building CUDA object tools/library/CMakeFiles/cutlass_library_symm_sm90_gz1684symm_objs.dir/generated/symm/90/gz1684symm/cutlass_tensorop_gz1684symm_64x64x8_1x1x1_3_n_rs_u_align1.cu.o [ 72%] Linking CUDA static library libcutlass_gemm_sm75_f16_s1688gemm_f16.a [ 72%] Built target cutlass_library_gemm_sm75_f16_s1688gemm_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_array_f16.a [ 72%] Built target cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_f16.a [ 72%] Built target cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_h1688gemm.a [ 72%] Built target cutlass_library_gemm_sm75_h1688gemm_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_h1688gemm_planar_complex.a [ 72%] Built target cutlass_library_gemm_sm75_h1688gemm_planar_complex_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_h1688gemm_planar_complex_array.a [ 72%] Built target cutlass_library_gemm_sm75_h1688gemm_planar_complex_array_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_i88128xorgemm_b1.a [ 72%] Built target cutlass_library_gemm_sm75_i88128xorgemm_b1_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_i8816gemm_s8.a [ 72%] Built target cutlass_library_gemm_sm75_i8816gemm_s8_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_i8816gemm_u8.a [ 72%] Built target cutlass_library_gemm_sm75_i8816gemm_u8_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_i8832gemm_s4.a [ 72%] Built target cutlass_library_gemm_sm75_i8832gemm_s4_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_i8832gemm_u4.a [ 72%] Built target cutlass_library_gemm_sm75_i8832gemm_u4_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_s1688gemm_f16.a [ 72%] Built target cutlass_library_gemm_sm75_s1688gemm_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_s1688gemm_planar_complex_array_f16.a [ 72%] Built target cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16_static [ 72%] Linking CUDA static library libcutlass_gemm_sm75_s1688gemm_planar_complex_f16.a [ 72%] Built target cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16_static [ 73%] Linking CUDA static library libcutlass_gemm_sm75_s4_i8832gemm_s4.a [ 73%] Built target cutlass_library_gemm_sm75_s4_i8832gemm_s4_static [ 73%] Linking CUDA static library libcutlass_gemm_sm75_s8_i8816gemm_s8.a [ 73%] Built target cutlass_library_gemm_sm75_s8_i8816gemm_s8_static [ 73%] Linking CUDA static library libcutlass_gemm_sm75_u4_i8832gemm_u4.a [ 73%] Built target cutlass_library_gemm_sm75_u4_i8832gemm_u4_static [ 73%] Linking CUDA static library libcutlass_gemm_sm75_u8_i8816gemm_u8.a [ 73%] Built target cutlass_library_gemm_sm75_u8_i8816gemm_u8_static [ 73%] Linking CUDA static library libcutlass_gemm_sm80_bf16_s16816gemm_bf16.a [ 73%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_static [ 73%] Linking CUDA static library libcutlass_gemm_sm80_bf16_s16816gemm_bf16_s8.a [ 73%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_s8_static [ 73%] Linking CUDA static library libcutlass_gemm_sm80_bf16_s16816gemm_bf16_u8.a [ 73%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_u8_static [ 73%] Linking CUDA static library libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16.a [ 73%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16_static [ 73%] Linking CUDA static library libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_bf16.a [ 73%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_bf16_s16816gemm_s8_bf16.a [ 74%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_s8_bf16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_bf16_s16816gemm_u8_bf16.a [ 74%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_u8_bf16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_bf16_s16832spgemm_bf16.a [ 74%] Built target cutlass_library_gemm_sm80_bf16_s16832spgemm_bf16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_c1688gemm.a [ 74%] Built target cutlass_library_gemm_sm80_c1688gemm_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_c1688tf32gemm.a [ 74%] Linking CUDA static library libcutlass_gemm_sm80_cgemm.a [ 74%] Built target cutlass_library_gemm_sm80_c1688tf32gemm_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_d884gemm.a [ 74%] Built target cutlass_library_gemm_sm80_d884gemm_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_dgemm.a [ 74%] Built target cutlass_library_gemm_sm80_cgemm_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_f16_s16816gemm_f16.a [ 74%] Built target cutlass_library_gemm_sm80_dgemm_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_f16_s16816gemm_f16_s8.a [ 74%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_f16_s8_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_f16_s16816gemm_f16_u8.a [ 74%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_f16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_array_f16.a [ 74%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_f16_u8_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_f16.a [ 74%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_f16_s16816gemm_s8_f16.a [ 74%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_s8_f16_static [ 74%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_f16_s16816gemm_u8_f16.a [ 74%] Linking CUDA static library libcutlass_gemm_sm80_f16_s16832spgemm_f16.a [ 74%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_u8_f16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_gz884gemm.a [ 74%] Built target cutlass_library_gemm_sm80_f16_s16832spgemm_f16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_h16816gemm.a [ 74%] Built target cutlass_library_gemm_sm80_h16816gemm_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_h16816gemm_grouped.a [ 74%] Built target cutlass_library_gemm_sm80_gz884gemm_static [ 74%] Built target cutlass_library_gemm_sm80_h16816gemm_grouped_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_h16816gemm_planar_complex.a [ 74%] Linking CUDA static library libcutlass_gemm_sm80_h16816gemm_planar_complex_array.a [ 74%] Built target cutlass_library_gemm_sm80_h16816gemm_planar_complex_static [ 74%] Built target cutlass_library_gemm_sm80_h16816gemm_planar_complex_array_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_h16816gemm_s8_f16.a [ 74%] Linking CUDA static library libcutlass_gemm_sm80_h16832spgemm.a [ 74%] Built target cutlass_library_gemm_sm80_h16816gemm_s8_f16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_i168128spgemm_s4.a [ 74%] Built target cutlass_library_gemm_sm80_i168128spgemm_s4_static [ 74%] Built target cutlass_library_gemm_sm80_h16832spgemm_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_i168256andgemm_b1.a [ 74%] Linking CUDA static library libcutlass_gemm_sm80_i168256xorgemm_b1.a [ 74%] Built target cutlass_library_gemm_sm80_i168256andgemm_b1_static [ 74%] Built target cutlass_library_gemm_sm80_i168256xorgemm_b1_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_i16832gemm_s8.a [ 74%] Linking CUDA static library libcutlass_gemm_sm80_i16832gemm_u8.a [ 74%] Built target cutlass_library_gemm_sm80_i16832gemm_s8_static [ 74%] Built target cutlass_library_gemm_sm80_i16832gemm_u8_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_i16864gemm_s4.a [ 74%] Linking CUDA static library libcutlass_gemm_sm80_i16864gemm_u4.a [ 74%] Built target cutlass_library_gemm_sm80_i16864gemm_u4_static [ 74%] Built target cutlass_library_gemm_sm80_i16864gemm_s4_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_i16864spgemm_s8.a [ 74%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_bf16.a [ 74%] Built target cutlass_library_gemm_sm80_i16864spgemm_s8_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_bf16_s8.a [ 74%] Built target cutlass_library_gemm_sm80_s16816gemm_bf16_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_bf16_u8.a [ 74%] Built target cutlass_library_gemm_sm80_s16816gemm_bf16_s8_static [ 74%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_f16.a [ 74%] Built target cutlass_library_gemm_sm80_s16816gemm_bf16_u8_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_f16_s8.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_f16_s8_static [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_f16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_f16_u8.a [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_grouped_bf16.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_f16_u8_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_grouped_f16.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_grouped_bf16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_planar_complex_array_bf16.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_grouped_f16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_planar_complex_array_f16.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_planar_complex_bf16.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_planar_complex_f16.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_s8_bf16.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16_static [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_s8_bf16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_s8_f16.a [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_u8_bf16.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_s8_f16_static [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_u8_bf16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816gemm_u8_f16.a [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16816tf32spgemm.a [ 75%] Built target cutlass_library_gemm_sm80_s16816gemm_u8_f16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16832spgemm_bf16.a [ 75%] Built target cutlass_library_gemm_sm80_s16816tf32spgemm_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s16832spgemm_f16.a [ 75%] Built target cutlass_library_gemm_sm80_s16832spgemm_bf16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s1688bf16gemm.a [ 75%] Built target cutlass_library_gemm_sm80_s16832spgemm_f16_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s1688f16gemm.a [ 75%] Built target cutlass_library_symm_sm90_z1684hemm_objs [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s1688gemm.a [ 75%] Built target cutlass_library_gemm_sm80_s1688bf16gemm_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s1688gemm_tf32.a [ 75%] Built target cutlass_library_gemm_sm80_s1688f16gemm_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s1688tf32gemm.a [ 75%] Built target cutlass_library_gemm_sm80_s1688gemm_static [ 75%] Built target cutlass_library_gemm_sm80_s1688gemm_tf32_static [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s4_i168128spgemm_s4.a [ 75%] Linking CUDA static library libcutlass_gemm_sm80_s4_i16864gemm_s4.a [ 75%] Built target cutlass_library_gemm_sm80_s4_i168128spgemm_s4_static [ 76%] Linking CUDA static library libcutlass_gemm_sm80_s8_i16832gemm_s8.a [ 76%] Built target cutlass_library_gemm_sm80_s1688tf32gemm_static [ 76%] Built target cutlass_library_gemm_sm80_s4_i16864gemm_s4_static [ 76%] Linking CUDA static library libcutlass_gemm_sm80_s8_i16864spgemm_s8.a [ 76%] Linking CUDA static library libcutlass_gemm_sm80_sgemm.a [ 76%] Built target cutlass_library_gemm_sm80_s8_i16864spgemm_s8_static [ 76%] Built target cutlass_library_gemm_sm80_s8_i16832gemm_s8_static [ 76%] Linking CUDA static library libcutlass_gemm_sm80_tf32_s1688gemm_tf32.a [ 76%] Linking CUDA static library libcutlass_gemm_sm80_u4_i16864gemm_u4.a [ 76%] Built target cutlass_library_gemm_sm80_u4_i16864gemm_u4_static [ 76%] Linking CUDA static library libcutlass_gemm_sm80_u8_i16832gemm_u8.a [ 76%] Built target cutlass_library_gemm_sm80_sgemm_static [ 76%] Linking CUDA static library libcutlass_gemm_sm80_z884gemm.a [ 76%] Built target cutlass_library_gemm_sm80_u8_i16832gemm_u8_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3.a [ 76%] Built target cutlass_library_gemm_sm80_tf32_s1688gemm_tf32_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2.a [ 76%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2.a [ 76%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3.a [ 76%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16832gemm_e4m3.a [ 76%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16832gemm_e4m3_e5m2.a [ 76%] Built target cutlass_library_gemm_sm89_s16832gemm_e4m3_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16832gemm_e5m2.a [ 76%] Built target cutlass_library_gemm_sm89_s16832gemm_e4m3_e5m2_static [ 76%] Built target cutlass_library_gemm_sm89_s16832gemm_e5m2_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16832gemm_e5m2_e4m3.a [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3.a [ 76%] Built target cutlass_library_gemm_sm89_s16832gemm_e5m2_e4m3_static [ 76%] Built target cutlass_library_gemm_sm80_z884gemm_static [ 76%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2.a [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2.a [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3.a [ 76%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2_static [ 76%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16864spgemm_e4m3.a [ 76%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3_static [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16864spgemm_e4m3_e5m2.a [ 76%] Linking CUDA static library libcutlass_gemm_sm89_s16864spgemm_e5m2.a [ 76%] Built target cutlass_library_gemm_sm89_s16864spgemm_e4m3_static [ 76%] Built target cutlass_library_gemm_sm89_s16864spgemm_e4m3_e5m2_static [ 77%] Linking CUDA static library libcutlass_gemm_sm89_s16864spgemm_e5m2_e4m3.a [ 77%] Built target cutlass_library_gemm_sm89_s16864spgemm_e5m2_static [ 77%] Linking CUDA static library libcutlass_gemm_sm90_bf16_s64x128x16gemm_bf16.a [ 77%] Linking CUDA static library libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3.a [ 77%] Built target cutlass_library_gemm_sm89_s16864spgemm_e5m2_e4m3_static [ 77%] Linking CUDA static library libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2.a [ 77%] Built target cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16_static [ 77%] Linking CUDA static library libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2.a [ 77%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2_static [ 77%] Linking CUDA static library libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3.a [ 77%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_static [ 77%] Linking CUDA static library libcutlass_gemm_sm90_d1684gemm.a [ 77%] Built target cutlass_library_gemm_sm90_d1684gemm_static [ 77%] Linking CUDA static library libcutlass_gemm_sm90_f16_s64x128x16gemm_f16.a [ 77%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_static [ 78%] Linking CUDA static library libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3.a [ 78%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3_static [ 78%] Linking CUDA static library libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2.a [ 78%] Built target cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16_static [ 78%] Linking CUDA static library libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2.a [ 78%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_static [ 78%] Linking CUDA static library libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3.a [ 78%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_gz1684gemm.a [ 79%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_h64x128x16gemm.a [ 79%] Built target cutlass_library_gemm_sm90_gz1684gemm_static [ 79%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_i64x128x32gemm_s8.a [ 79%] Linking CUDA static library libcutlass_gemm_sm90_i64x128x32gemm_u8.a [ 79%] Built target cutlass_library_gemm_sm90_i64x128x32gemm_s8_static [ 79%] Built target cutlass_library_gemm_sm90_i64x128x32gemm_u8_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s64x128x16gemm_bf16.a [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s64x128x16gemm_f16.a [ 79%] Built target cutlass_library_gemm_sm90_h64x128x16gemm_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s64x128x32gemm_e4m3.a [ 79%] Built target cutlass_library_gemm_sm90_s64x128x16gemm_bf16_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s64x128x32gemm_e4m3_e5m2.a [ 79%] Built target cutlass_library_gemm_sm90_s64x128x16gemm_f16_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s64x128x32gemm_e5m2.a [ 79%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s64x128x32gemm_e5m2_e4m3.a [ 79%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s64x128x8gemm_tf32.a [ 79%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s64x128x8tf32gemm.a [ 79%] Built target cutlass_library_gemm_sm90_s64x128x8gemm_tf32_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s8_i64x128x32gemm_s8.a [ 79%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_s8_i64x128x32gemm_u8.a [ 79%] Built target cutlass_library_gemm_sm90_s64x128x8tf32gemm_static [ 79%] Built target cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_void_i64x128x32gemm_s8.a [ 79%] Linking CUDA static library libcutlass_gemm_sm90_void_i64x128x32gemm_u8.a [ 79%] Built target cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8_static [ 79%] Built target cutlass_library_gemm_sm90_void_i64x128x32gemm_s8_static [ 79%] Built target cutlass_library_gemm_sm90_void_i64x128x32gemm_u8_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_void_s64x128x16gemm_bf16.a [ 79%] Linking CUDA static library libcutlass_gemm_sm90_void_s64x128x16gemm_f16.a [ 79%] Linking CUDA static library libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3.a [ 79%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2.a [ 79%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2.a [ 79%] Built target cutlass_library_gemm_sm90_void_s64x128x16gemm_f16_static [ 79%] Built target cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_static [ 79%] Linking CUDA static library libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3.a [ 79%] Linking CUDA static library libcutlass_gemm_sm90_z1684gemm.a [ 79%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_static [ 79%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm50_cf32_cdgrad_optimized_cf32.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm50_cf32_cfprop_optimized_cf32.a [ 79%] Built target cutlass_library_conv2d_sm50_cf32_cfprop_optimized_cf32_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm50_cf32_cwgrad_optimized_cf32.a [ 79%] Built target cutlass_library_conv2d_sm50_cf32_cdgrad_optimized_cf32_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm50_sdgrad_optimized.a [ 79%] Built target cutlass_library_conv2d_sm50_cf32_cwgrad_optimized_cf32_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm50_sfprop_optimized.a [ 79%] Built target cutlass_library_gemm_sm90_z1684gemm_static [ 79%] Built target cutlass_library_conv2d_sm50_sdgrad_optimized_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm50_swgrad_optimized.a [ 79%] Built target cutlass_library_conv2d_sm50_sfprop_optimized_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm60_hfprop_optimized.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm70_f16_s884dgrad_optimized_f16.a [ 79%] Built target cutlass_library_conv2d_sm50_swgrad_optimized_static [ 79%] Built target cutlass_library_conv2d_sm60_hfprop_optimized_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm70_f16_s884fprop_optimized_f16.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm70_f16_s884wgrad_optimized_f16.a [ 79%] Built target cutlass_library_conv2d_sm70_f16_s884dgrad_optimized_f16_static [ 79%] Built target cutlass_library_conv2d_sm70_f16_s884fprop_optimized_f16_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm70_h884dgrad_optimized.a [ 79%] Built target cutlass_library_conv2d_sm70_f16_s884wgrad_optimized_f16_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm70_h884fprop_optimized.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm70_h884wgrad_optimized.a [ 79%] Built target cutlass_library_conv2d_sm70_h884fprop_optimized_static [ 79%] Built target cutlass_library_conv2d_sm70_h884wgrad_optimized_static [ 79%] Built target cutlass_library_conv2d_sm70_h884dgrad_optimized_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm70_s884dgrad_optimized_f16.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm70_s884fprop_optimized_f16.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm70_s884wgrad_optimized_f16.a [ 79%] Built target cutlass_library_conv2d_sm70_s884fprop_optimized_f16_static [ 79%] Built target cutlass_library_conv2d_sm70_s884dgrad_optimized_f16_static [ 79%] Built target cutlass_library_conv2d_sm70_s884wgrad_optimized_f16_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm75_cf32_cdgrad_optimized_cf32.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm75_cf32_cfprop_optimized_cf32.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm75_cf32_cwgrad_optimized_cf32.a [ 79%] Built target cutlass_library_conv2d_sm75_cf32_cwgrad_optimized_cf32_static [ 79%] Built target cutlass_library_conv2d_sm75_cf32_cfprop_optimized_cf32_static [ 79%] Built target cutlass_library_conv2d_sm75_cf32_cdgrad_optimized_cf32_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm75_f16_s1688dgrad_optimized_f16.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm75_f16_s1688fprop_few_channels_f16.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm75_f16_s1688fprop_fixed_channels_f16.a [ 79%] Built target cutlass_library_conv2d_sm75_f16_s1688fprop_fixed_channels_f16_static [ 79%] Built target cutlass_library_conv2d_sm75_f16_s1688fprop_few_channels_f16_static [ 79%] Built target cutlass_library_conv2d_sm75_f16_s1688dgrad_optimized_f16_static [ 79%] Linking CUDA static library libcutlass_conv2d_sm75_f16_s1688fprop_optimized_f16.a [ 79%] Linking CUDA static library libcutlass_conv2d_sm75_f16_s1688wgrad_optimized_f16.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_h1688dgrad_optimized.a [ 80%] Built target cutlass_library_conv2d_sm75_f16_s1688fprop_optimized_f16_static [ 80%] Built target cutlass_library_conv2d_sm75_f16_s1688wgrad_optimized_f16_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_h1688fprop_few_channels.a [ 80%] Built target cutlass_library_conv2d_sm75_h1688dgrad_optimized_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_h1688fprop_fixed_channels.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_h1688fprop_optimized.a [ 80%] Built target cutlass_library_conv2d_sm75_h1688fprop_fixed_channels_static [ 80%] Built target cutlass_library_conv2d_sm75_h1688fprop_few_channels_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_h1688wgrad_optimized.a [ 80%] Built target cutlass_library_conv2d_sm75_h1688fprop_optimized_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_i8816fprop_optimized_s8.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_i8816fprop_optimized_u8.a [ 80%] Built target cutlass_library_conv2d_sm75_h1688wgrad_optimized_static [ 80%] Built target cutlass_library_conv2d_sm75_i8816fprop_optimized_s8_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_i8832fprop_optimized_s4.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_i8832fprop_optimized_u4.a [ 80%] Built target cutlass_library_conv2d_sm75_i8816fprop_optimized_u8_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_s1688dgrad_optimized_f16.a [ 80%] Built target cutlass_library_conv2d_sm75_i8832fprop_optimized_s4_static [ 80%] Built target cutlass_library_conv2d_sm75_i8832fprop_optimized_u4_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_s1688fprop_few_channels_f16.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_s1688fprop_fixed_channels_f16.a [ 80%] Built target cutlass_library_conv2d_sm75_s1688dgrad_optimized_f16_static [ 80%] Built target cutlass_library_conv2d_sm75_s1688fprop_few_channels_f16_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_s1688fprop_optimized_f16.a [ 80%] Built target cutlass_library_conv2d_sm75_s1688fprop_fixed_channels_f16_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_s1688wgrad_optimized_f16.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_s4_i8832fprop_optimized_s4.a [ 80%] Built target cutlass_library_conv2d_sm75_s1688fprop_optimized_f16_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_s8_i8816fprop_few_channels_s8.a [ 80%] Built target cutlass_library_conv2d_sm75_s1688wgrad_optimized_f16_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_s8_i8816fprop_fixed_channels_s8.a [ 80%] Built target cutlass_library_conv2d_sm75_s4_i8832fprop_optimized_s4_static [ 80%] Built target cutlass_library_conv2d_sm75_s8_i8816fprop_few_channels_s8_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_s8_i8816fprop_optimized_s8.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_u4_i8832fprop_optimized_u4.a [ 80%] Built target cutlass_library_conv2d_sm75_s8_i8816fprop_fixed_channels_s8_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_u8_i8816fprop_few_channels_u8.a [ 80%] Built target cutlass_library_symm_sm90_gz1684symm_objs [ 80%] Built target cutlass_library_conv2d_sm75_s8_i8816fprop_optimized_s8_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_u8_i8816fprop_fixed_channels_u8.a [ 80%] Built target cutlass_library_conv2d_sm75_u4_i8832fprop_optimized_u4_static [ 80%] Built target cutlass_library_conv2d_sm75_u8_i8816fprop_few_channels_u8_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm75_u8_i8816fprop_optimized_u8.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_bf16_s16816dgrad_optimized_bf16.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16.a [ 80%] Built target cutlass_library_conv2d_sm75_u8_i8816fprop_fixed_channels_u8_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_bf16_s16816fprop_optimized_bf16.a [ 80%] Built target cutlass_library_conv2d_sm75_u8_i8816fprop_optimized_u8_static [ 80%] Built target cutlass_library_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16_static [ 80%] Built target cutlass_library_conv2d_sm80_bf16_s16816dgrad_optimized_bf16_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_f16_s16816dgrad_optimized_f16.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_bf16_s16816wgrad_optimized_bf16.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_f16_s16816fprop_fixed_channels_f16.a [ 80%] Built target cutlass_library_conv2d_sm80_bf16_s16816fprop_optimized_bf16_static [ 80%] Built target cutlass_library_conv2d_sm80_bf16_s16816wgrad_optimized_bf16_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_f16_s16816fprop_optimized_f16.a [ 80%] Built target cutlass_library_conv2d_sm80_f16_s16816fprop_fixed_channels_f16_static [ 80%] Built target cutlass_library_conv2d_sm80_f16_s16816dgrad_optimized_f16_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_f16_s16816wgrad_optimized_f16.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_h16816dgrad_optimized.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_h16816fprop_fixed_channels.a [ 80%] Built target cutlass_library_conv2d_sm80_f16_s16816fprop_optimized_f16_static [ 80%] Built target cutlass_library_conv2d_sm80_f16_s16816wgrad_optimized_f16_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_h16816fprop_optimized.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_h16816wgrad_optimized.a [ 80%] Built target cutlass_library_conv2d_sm80_h16816fprop_fixed_channels_static [ 80%] Built target cutlass_library_conv2d_sm80_h16816dgrad_optimized_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_i16832fprop_optimized_s8.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_i16832fprop_optimized_u8.a [ 80%] Built target cutlass_library_conv2d_sm80_h16816wgrad_optimized_static [ 80%] Built target cutlass_library_conv2d_sm80_h16816fprop_optimized_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_i16864fprop_optimized_s4.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_i16864fprop_optimized_u4.a [ 80%] Built target cutlass_library_conv2d_sm80_i16832fprop_optimized_s8_static [ 80%] Built target cutlass_library_conv2d_sm80_i16832fprop_optimized_u8_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_s16816dgrad_optimized_bf16.a [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_s16816dgrad_optimized_f16.a [ 80%] Built target cutlass_library_conv2d_sm80_i16864fprop_optimized_s4_static [ 80%] Built target cutlass_library_conv2d_sm80_i16864fprop_optimized_u4_static [ 80%] Linking CUDA static library libcutlass_conv2d_sm80_s16816fprop_fixed_channels_bf16.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s16816fprop_fixed_channels_f16.a [ 81%] Built target cutlass_library_conv2d_sm80_s16816dgrad_optimized_bf16_static [ 81%] Built target cutlass_library_conv2d_sm80_s16816dgrad_optimized_f16_static [ 81%] Built target cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_bf16_static [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s16816fprop_optimized_bf16.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s16816fprop_optimized_f16.a [ 81%] Built target cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_f16_static [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s16816wgrad_optimized_bf16.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s16816wgrad_optimized_f16.a [ 81%] Built target cutlass_library_conv2d_sm80_s16816fprop_optimized_bf16_static [ 81%] Built target cutlass_library_conv2d_sm80_s16816wgrad_optimized_bf16_static [ 81%] Built target cutlass_library_conv2d_sm80_s16816fprop_optimized_f16_static [ 81%] Built target cutlass_library_conv2d_sm80_s16816wgrad_optimized_f16_static [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688bf16dgrad_optimized.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688bf16fprop_optimized.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688bf16wgrad_optimized.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688dgrad_optimized.a [ 81%] Built target cutlass_library_conv2d_sm80_s1688bf16wgrad_optimized_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688bf16dgrad_optimized_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688bf16fprop_optimized_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688dgrad_optimized_static [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688dgrad_optimized_tf32.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688f16dgrad_optimized.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688f16fprop_optimized.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688f16wgrad_optimized.a [ 81%] Built target cutlass_library_conv2d_sm80_s1688dgrad_optimized_tf32_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688f16wgrad_optimized_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688f16dgrad_optimized_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688f16fprop_optimized_static [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688fprop_optimized.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688fprop_optimized_tf32.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688tf32dgrad_optimized.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688tf32fprop_optimized.a [ 81%] Built target cutlass_library_conv2d_sm80_s1688fprop_optimized_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688fprop_optimized_tf32_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688tf32fprop_optimized_static [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688tf32wgrad_optimized.a [ 81%] Built target cutlass_library_conv2d_sm80_s1688tf32dgrad_optimized_static [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688wgrad_optimized.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s1688wgrad_optimized_tf32.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s4_i16864fprop_optimized_s4.a [ 81%] Built target cutlass_library_conv2d_sm80_s1688tf32wgrad_optimized_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688wgrad_optimized_static [ 81%] Built target cutlass_library_conv2d_sm80_s1688wgrad_optimized_tf32_static [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s8_i16832fprop_few_channels_s8.a [ 81%] Linking CUDA static library libcutlass_conv2d_sm80_s8_i16832fprop_fixed_channels_s8.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_s8_i16832fprop_optimized_s8.a [ 82%] Built target cutlass_library_conv2d_sm80_s4_i16864fprop_optimized_s4_static [ 82%] Built target cutlass_library_conv2d_sm80_s8_i16832fprop_few_channels_s8_static [ 82%] Built target cutlass_library_conv2d_sm80_s8_i16832fprop_fixed_channels_s8_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_sdgrad_optimized.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_swgrad_optimized.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_sfprop_optimized.a [ 82%] Built target cutlass_library_conv2d_sm80_s8_i16832fprop_optimized_s8_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_tf32_s1688dgrad_optimized_tf32.a [ 82%] Built target cutlass_library_conv2d_sm80_swgrad_optimized_static [ 82%] Built target cutlass_library_conv2d_sm80_sfprop_optimized_static [ 82%] Built target cutlass_library_conv2d_sm80_sdgrad_optimized_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_tf32_s1688fprop_optimized_tf32.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_tf32_s1688wgrad_optimized_tf32.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_u4_i16864fprop_optimized_u4.a [ 82%] Built target cutlass_library_conv2d_sm80_tf32_s1688dgrad_optimized_tf32_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_u8_i16832fprop_few_channels_u8.a [ 82%] Built target cutlass_library_conv2d_sm80_tf32_s1688wgrad_optimized_tf32_static [ 82%] Built target cutlass_library_conv2d_sm80_tf32_s1688fprop_optimized_tf32_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_u8_i16832fprop_fixed_channels_u8.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm80_u8_i16832fprop_optimized_u8.a [ 82%] Built target cutlass_library_conv2d_sm80_u4_i16864fprop_optimized_u4_static [ 82%] Built target cutlass_library_conv2d_sm80_u8_i16832fprop_few_channels_u8_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e4m3.a [ 82%] Built target cutlass_library_conv2d_sm80_u8_i16832fprop_fixed_channels_u8_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e5m2.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm89_s16832fprop_optimized_e4m3.a [ 82%] Built target cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e4m3_static [ 82%] Built target cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e5m2_static [ 82%] Built target cutlass_library_conv2d_sm80_u8_i16832fprop_optimized_u8_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm89_s16832fprop_optimized_e5m2.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16.a [ 82%] Built target cutlass_library_conv2d_sm89_s16832fprop_optimized_e4m3_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32.a [ 82%] Built target cutlass_library_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16_static [ 82%] Built target cutlass_library_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16_static [ 82%] Built target cutlass_library_conv2d_sm89_s16832fprop_optimized_e5m2_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32.a [ 82%] Linking CUDA static library libcutlass_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32.a [ 82%] Built target cutlass_library_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32_static [ 82%] Linking CUDA static library libcutlass_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16.a [ 82%] Built target cutlass_library_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32_static [ 82%] Built target cutlass_library_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16.a [ 82%] Built target cutlass_library_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16.a [ 82%] Built target cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16.a [ 82%] Built target cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_f16_s16816dgrad3d_analytic_f16.a [ 82%] Built target cutlass_library_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_f16_s16816dgrad3d_optimized_f16.a [ 82%] Built target cutlass_library_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_f16_s16816fprop3d_optimized_f16.a [ 82%] Built target cutlass_library_conv3d_sm80_f16_s16816dgrad3d_analytic_f16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_f16_s16816wgrad3d_optimized_f16.a [ 82%] Built target cutlass_library_conv3d_sm80_f16_s16816dgrad3d_optimized_f16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_h16816dgrad3d_analytic.a [ 82%] Built target cutlass_library_conv3d_sm80_f16_s16816fprop3d_optimized_f16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_h16816dgrad3d_optimized.a [ 82%] Built target cutlass_library_conv3d_sm80_f16_s16816wgrad3d_optimized_f16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_h16816fprop3d_optimized.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_h16816wgrad3d_optimized.a [ 82%] Built target cutlass_library_conv3d_sm80_h16816dgrad3d_analytic_static [ 82%] Built target cutlass_library_conv3d_sm80_h16816dgrad3d_optimized_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_s16816dgrad3d_analytic_bf16.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_s16816dgrad3d_analytic_f16.a [ 82%] Built target cutlass_library_conv3d_sm80_h16816fprop3d_optimized_static [ 82%] Built target cutlass_library_conv3d_sm80_h16816wgrad3d_optimized_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_s16816dgrad3d_optimized_bf16.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_s16816dgrad3d_optimized_f16.a [ 82%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_bf16_static [ 82%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_f16_static [ 82%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_bf16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_s16816fprop3d_optimized_bf16.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_s16816fprop3d_optimized_f16.a [ 82%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_f16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_s16816wgrad3d_optimized_bf16.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm80_s16816wgrad3d_optimized_f16.a [ 82%] Built target cutlass_library_conv3d_sm80_s16816fprop3d_optimized_bf16_static [ 82%] Built target cutlass_library_conv3d_sm80_s16816fprop3d_optimized_f16_static [ 82%] Built target cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_bf16_static [ 82%] Built target cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_f16_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32.a [ 82%] Built target cutlass_library_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16_static [ 82%] Built target cutlass_library_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16_static [ 82%] Built target cutlass_library_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32_static [ 82%] Built target cutlass_library_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32_static [ 82%] Linking CUDA static library libcutlass_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32.a [ 82%] Linking CUDA static library libcutlass_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32.a [ 82%] Linking CUDA static library libcutlass_rank_k_sm80_c1688herk.a [ 82%] Linking CUDA static library libcutlass_rank_k_sm80_c1688syrk.a [ 82%] Built target cutlass_library_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32_static [ 82%] Built target cutlass_library_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32_static [ 82%] Linking CUDA static library libcutlass_rank_k_sm80_c1688tf32herk.a [ 82%] Built target cutlass_library_rank_k_sm80_c1688herk_static [ 82%] Built target cutlass_library_rank_k_sm80_c1688syrk_static [ 82%] Linking CUDA static library libcutlass_rank_k_sm80_c1688tf32syrk.a [ 82%] Linking CUDA static library libcutlass_rank_k_sm80_d884syrk.a [ 82%] Linking CUDA static library libcutlass_rank_k_sm80_gz884herk.a [ 82%] Built target cutlass_library_rank_k_sm80_c1688tf32herk_static [ 82%] Built target cutlass_library_rank_k_sm80_c1688tf32syrk_static [ 82%] Linking CUDA static library libcutlass_rank_k_sm80_gz884syrk.a [ 82%] Built target cutlass_library_rank_k_sm80_d884syrk_static [ 82%] Built target cutlass_library_rank_k_sm80_gz884herk_static [ 82%] Linking CUDA static library libcutlass_rank_k_sm80_s1688syrk.a [ 83%] Linking CUDA static library libcutlass_rank_k_sm80_s1688tf32syrk.a [ 83%] Linking CUDA static library libcutlass_rank_k_sm80_z884herk.a [ 83%] Built target cutlass_library_rank_k_sm80_gz884syrk_static [ 83%] Built target cutlass_library_rank_k_sm80_s1688syrk_static [ 83%] Linking CUDA static library libcutlass_rank_k_sm80_z884syrk.a [ 83%] Built target cutlass_library_rank_k_sm80_z884herk_static [ 83%] Built target cutlass_library_rank_k_sm80_s1688tf32syrk_static [ 83%] Linking CUDA static library libcutlass_rank_k_sm90_d1684syrk.a [ 83%] Linking CUDA static library libcutlass_rank_k_sm90_gz1684herk.a [ 83%] Linking CUDA static library libcutlass_rank_k_sm90_gz1684syrk.a [ 83%] Built target cutlass_library_rank_k_sm80_z884syrk_static [ 83%] Built target cutlass_library_rank_k_sm90_d1684syrk_static [ 83%] Linking CUDA static library libcutlass_rank_k_sm90_z1684herk.a [ 83%] Built target cutlass_library_rank_k_sm90_gz1684herk_static [ 83%] Built target cutlass_library_rank_k_sm90_gz1684syrk_static [ 83%] Linking CUDA static library libcutlass_rank_k_sm90_z1684syrk.a [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_c1688her2k.a [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_c1688syr2k.a [ 83%] Built target cutlass_library_rank_k_sm90_z1684herk_static [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_c1688tf32her2k.a [ 83%] Built target cutlass_library_rank_k_sm90_z1684syrk_static [ 83%] Built target cutlass_library_rank_2k_sm80_c1688her2k_static [ 83%] Built target cutlass_library_rank_2k_sm80_c1688syr2k_static [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_c1688tf32syr2k.a [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_d884syr2k.a [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_gz884her2k.a [ 83%] Built target cutlass_library_rank_2k_sm80_c1688tf32her2k_static [ 83%] Built target cutlass_library_rank_2k_sm80_c1688tf32syr2k_static [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_gz884syr2k.a [ 83%] Built target cutlass_library_rank_2k_sm80_d884syr2k_static [ 83%] Built target cutlass_library_rank_2k_sm80_gz884her2k_static [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_s1688syr2k.a [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_s1688tf32syr2k.a [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_z884her2k.a [ 83%] Built target cutlass_library_rank_2k_sm80_gz884syr2k_static [ 83%] Linking CUDA static library libcutlass_rank_2k_sm80_z884syr2k.a [ 83%] Built target cutlass_library_rank_2k_sm80_s1688syr2k_static [ 83%] Built target cutlass_library_rank_2k_sm80_z884her2k_static [ 83%] Built target cutlass_library_rank_2k_sm80_s1688tf32syr2k_static [ 83%] Linking CUDA static library libcutlass_rank_2k_sm90_d1684syr2k.a [ 83%] Built target cutlass_library_rank_2k_sm80_z884syr2k_static [ 83%] Linking CUDA static library libcutlass_rank_2k_sm90_gz1684her2k.a [ 83%] Linking CUDA static library libcutlass_rank_2k_sm90_gz1684syr2k.a [ 83%] Linking CUDA static library libcutlass_rank_2k_sm90_z1684her2k.a [ 83%] Built target cutlass_library_rank_2k_sm90_d1684syr2k_static [ 83%] Built target cutlass_library_rank_2k_sm90_gz1684syr2k_static [ 83%] Built target cutlass_library_rank_2k_sm90_gz1684her2k_static [ 83%] Linking CUDA static library libcutlass_rank_2k_sm90_z1684syr2k.a [ 83%] Linking CUDA static library libcutlass_trmm_sm80_c1688tf32trmm.a [ 83%] Linking CUDA static library libcutlass_trmm_sm80_c1688trmm.a [ 83%] Built target cutlass_library_rank_2k_sm90_z1684her2k_static [ 83%] Linking CUDA static library libcutlass_trmm_sm80_d884trmm.a [ 83%] Built target cutlass_library_rank_2k_sm90_z1684syr2k_static [ 83%] Linking CUDA static library libcutlass_trmm_sm80_gz884trmm.a [ 83%] Built target cutlass_library_trmm_sm80_c1688tf32trmm_static [ 83%] Built target cutlass_library_trmm_sm80_d884trmm_static [ 83%] Built target cutlass_library_trmm_sm80_c1688trmm_static [ 83%] Linking CUDA static library libcutlass_trmm_sm80_s1688tf32trmm.a [ 83%] Linking CUDA static library libcutlass_trmm_sm80_s1688trmm.a [ 83%] Linking CUDA static library libcutlass_trmm_sm80_z884trmm.a [ 83%] Built target cutlass_library_trmm_sm80_gz884trmm_static [ 83%] Linking CUDA static library libcutlass_trmm_sm90_d1684trmm.a [ 83%] Built target cutlass_library_trmm_sm80_s1688tf32trmm_static [ 83%] Built target cutlass_library_trmm_sm80_s1688trmm_static [ 83%] Built target cutlass_library_trmm_sm80_z884trmm_static [ 83%] Linking CUDA static library libcutlass_trmm_sm90_gz1684trmm.a [ 83%] Linking CUDA static library libcutlass_trmm_sm90_z1684trmm.a [ 83%] Built target cutlass_library_trmm_sm90_d1684trmm_static [ 83%] Linking CUDA static library libcutlass_symm_sm80_c1688hemm.a [ 83%] Linking CUDA static library libcutlass_symm_sm80_c1688symm.a [ 83%] Built target cutlass_library_trmm_sm90_gz1684trmm_static [ 83%] Built target cutlass_library_symm_sm80_c1688hemm_static [ 83%] Built target cutlass_library_trmm_sm90_z1684trmm_static [ 83%] Built target cutlass_library_symm_sm80_c1688symm_static [ 83%] Linking CUDA static library libcutlass_symm_sm80_c1688tf32symm.a [ 83%] Linking CUDA static library libcutlass_symm_sm80_c1688tf32hemm.a [ 83%] Linking CUDA static library libcutlass_symm_sm80_d884symm.a [ 83%] Linking CUDA static library libcutlass_symm_sm80_gz884hemm.a [ 83%] Built target cutlass_library_symm_sm80_c1688tf32symm_static [ 83%] Built target cutlass_library_symm_sm80_c1688tf32hemm_static [ 83%] Built target cutlass_library_symm_sm80_d884symm_static [ 83%] Built target cutlass_library_symm_sm80_gz884hemm_static [ 83%] Linking CUDA static library libcutlass_symm_sm80_gz884symm.a [ 83%] Linking CUDA static library libcutlass_symm_sm80_s1688symm.a [ 83%] Linking CUDA static library libcutlass_symm_sm80_s1688tf32symm.a [ 83%] Linking CUDA static library libcutlass_symm_sm80_z884hemm.a [ 83%] Built target cutlass_library_symm_sm80_gz884symm_static [ 83%] Built target cutlass_library_symm_sm80_s1688symm_static [ 83%] Built target cutlass_library_symm_sm80_s1688tf32symm_static [ 83%] Built target cutlass_library_symm_sm80_z884hemm_static [ 83%] Linking CUDA static library libcutlass_symm_sm80_z884symm.a [ 83%] Linking CUDA static library libcutlass_symm_sm90_d1684symm.a [ 83%] Linking CUDA static library libcutlass_symm_sm90_gz1684hemm.a [ 83%] Linking CUDA static library libcutlass_symm_sm90_gz1684symm.a [ 83%] Built target cutlass_library_symm_sm80_z884symm_static [ 83%] Built target cutlass_library_symm_sm90_d1684symm_static [ 83%] Built target cutlass_library_symm_sm90_gz1684hemm_static [ 83%] Built target cutlass_library_symm_sm90_gz1684symm_static [ 83%] Linking CUDA static library libcutlass_symm_sm90_z1684hemm.a [ 83%] Linking CUDA shared library libcutlass_symm_sm90_z1684symm.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm50_cgemm.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm50_dgemm.so [ 83%] Built target cutlass_library_symm_sm90_z1684hemm_static [ 83%] Linking CUDA shared library libcutlass_gemm_sm50_sgemm.so [ 83%] Built target cutlass_library_gemm_sm50_sgemm [ 83%] Built target cutlass_library_symm_sm90_z1684symm [ 83%] Built target cutlass_library_gemm_sm50_cgemm [ 83%] Built target cutlass_library_gemm_sm50_dgemm [ 83%] Linking CUDA shared library libcutlass_gemm_sm60_hgemm.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm61_igemm_s8.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm61_s8_igemm_s8.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm70_f16_s884gemm_f16.so [ 83%] Built target cutlass_library_gemm_sm60_hgemm [ 83%] Built target cutlass_library_gemm_sm61_igemm_s8 [ 83%] Built target cutlass_library_gemm_sm61_s8_igemm_s8 [ 83%] Built target cutlass_library_gemm_sm70_f16_s884gemm_f16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm70_f16_s884gemm_planar_complex_array_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm70_h884gemm.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm70_f16_s884gemm_planar_complex_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm70_h884gemm_planar_complex.so [ 83%] Built target cutlass_library_gemm_sm70_h884gemm [ 83%] Built target cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_array_f16 [ 83%] Built target cutlass_library_gemm_sm70_h884gemm_planar_complex [ 83%] Built target cutlass_library_gemm_sm70_f16_s884gemm_planar_complex_f16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm70_h884gemm_planar_complex_array.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm70_s884gemm_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm70_s884gemm_planar_complex_array_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm70_s884gemm_planar_complex_f16.so [ 83%] Built target cutlass_library_gemm_sm70_s884gemm_f16 [ 83%] Built target cutlass_library_gemm_sm70_h884gemm_planar_complex_array [ 83%] Built target cutlass_library_gemm_sm70_s884gemm_planar_complex_array_f16 [ 83%] Built target cutlass_library_gemm_sm70_s884gemm_planar_complex_f16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_f16_s1688gemm_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_array_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_h1688gemm.so [ 83%] Built target cutlass_library_gemm_sm75_f16_s1688gemm_f16 [ 83%] Built target cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_array_f16 [ 83%] Built target cutlass_library_gemm_sm75_h1688gemm [ 83%] Built target cutlass_library_gemm_sm75_f16_s1688gemm_planar_complex_f16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_h1688gemm_planar_complex.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_h1688gemm_planar_complex_array.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_i88128xorgemm_b1.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_i8816gemm_s8.so [ 83%] Built target cutlass_library_gemm_sm75_h1688gemm_planar_complex [ 83%] Built target cutlass_library_gemm_sm75_i8816gemm_s8 [ 83%] Built target cutlass_library_gemm_sm75_i88128xorgemm_b1 [ 83%] Built target cutlass_library_gemm_sm75_h1688gemm_planar_complex_array [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_i8816gemm_u8.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_i8832gemm_s4.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_i8832gemm_u4.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_s1688gemm_f16.so [ 83%] Built target cutlass_library_gemm_sm75_i8816gemm_u8 [ 83%] Built target cutlass_library_gemm_sm75_i8832gemm_u4 [ 83%] Built target cutlass_library_gemm_sm75_i8832gemm_s4 [ 83%] Built target cutlass_library_gemm_sm75_s1688gemm_f16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_s1688gemm_planar_complex_array_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_s1688gemm_planar_complex_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_s4_i8832gemm_s4.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_s8_i8816gemm_s8.so [ 83%] Built target cutlass_library_gemm_sm75_s4_i8832gemm_s4 [ 83%] Built target cutlass_library_gemm_sm75_s1688gemm_planar_complex_array_f16 [ 83%] Built target cutlass_library_gemm_sm75_s8_i8816gemm_s8 [ 83%] Built target cutlass_library_gemm_sm75_s1688gemm_planar_complex_f16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_u4_i8832gemm_u4.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm75_u8_i8816gemm_u8.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_bf16_s16816gemm_bf16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_bf16_s16816gemm_bf16_s8.so [ 83%] Built target cutlass_library_gemm_sm75_u4_i8832gemm_u4 [ 83%] Built target cutlass_library_gemm_sm75_u8_i8816gemm_u8 [ 83%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_s8 [ 83%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_bf16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_bf16_s16816gemm_bf16_u8.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_bf16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_bf16_s16816gemm_s8_bf16.so [ 83%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_bf16_u8 [ 83%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_s8_bf16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_bf16_s16816gemm_u8_bf16.so [ 83%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16 [ 83%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_planar_complex_bf16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_bf16_s16832spgemm_bf16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_c1688gemm.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_c1688tf32gemm.so [ 83%] Built target cutlass_library_gemm_sm80_bf16_s16816gemm_u8_bf16 [ 83%] Built target cutlass_library_gemm_sm80_bf16_s16832spgemm_bf16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_cgemm.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_d884gemm.so [ 83%] Built target cutlass_library_gemm_sm80_c1688gemm [ 83%] Built target cutlass_library_gemm_sm80_c1688tf32gemm [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_dgemm.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_f16_s16816gemm_f16.so [ 83%] Built target cutlass_library_gemm_sm80_d884gemm [ 83%] Built target cutlass_library_gemm_sm80_cgemm [ 83%] Built target cutlass_library_gemm_sm80_dgemm [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_f16_s16816gemm_f16_u8.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_f16_s16816gemm_f16_s8.so [ 83%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_f16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_array_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_f16.so [ 83%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_f16_s8 [ 83%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_f16_u8 [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_f16_s16816gemm_s8_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_f16_s16816gemm_u8_f16.so [ 83%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_array_f16 [ 83%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_planar_complex_f16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_f16_s16832spgemm_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_gz884gemm.so [ 83%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_s8_f16 [ 83%] Built target cutlass_library_gemm_sm80_f16_s16816gemm_u8_f16 [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_h16816gemm.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_h16816gemm_grouped.so [ 83%] Built target cutlass_library_gemm_sm80_f16_s16832spgemm_f16 [ 83%] Built target cutlass_library_gemm_sm80_gz884gemm [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_h16816gemm_planar_complex.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_h16816gemm_planar_complex_array.so [ 83%] Built target cutlass_library_gemm_sm80_h16816gemm [ 83%] Built target cutlass_library_gemm_sm80_h16816gemm_grouped [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_h16816gemm_s8_f16.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_h16832spgemm.so [ 83%] Built target cutlass_library_gemm_sm80_h16816gemm_planar_complex [ 83%] Built target cutlass_library_gemm_sm80_h16816gemm_planar_complex_array [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_i168128spgemm_s4.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_i168256andgemm_b1.so [ 83%] Built target cutlass_library_gemm_sm80_h16816gemm_s8_f16 [ 83%] Built target cutlass_library_gemm_sm80_h16832spgemm [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_i168256xorgemm_b1.so [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_i16832gemm_s8.so [ 83%] Built target cutlass_library_gemm_sm80_i168128spgemm_s4 [ 83%] Built target cutlass_library_gemm_sm80_i168256andgemm_b1 [ 83%] Linking CUDA shared library libcutlass_gemm_sm80_i16832gemm_u8.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_i16864gemm_s4.so [ 84%] Built target cutlass_library_gemm_sm80_i168256xorgemm_b1 [ 84%] Built target cutlass_library_gemm_sm80_i16832gemm_s8 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_i16864gemm_u4.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_i16864spgemm_s8.so [ 84%] Built target cutlass_library_gemm_sm80_i16832gemm_u8 [ 84%] Built target cutlass_library_gemm_sm80_i16864gemm_s4 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_bf16.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_bf16_s8.so [ 84%] Built target cutlass_library_gemm_sm80_i16864gemm_u4 [ 84%] Built target cutlass_library_gemm_sm80_i16864spgemm_s8 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_bf16_u8.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_f16.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_bf16 [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_bf16_s8 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_f16_s8.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_f16_u8.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_bf16_u8 [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_f16 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_grouped_bf16.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_grouped_f16.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_f16_s8 [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_f16_u8 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_planar_complex_array_bf16.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_planar_complex_array_f16.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_grouped_bf16 [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_grouped_f16 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_planar_complex_bf16.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_planar_complex_f16.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_bf16 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_s8_bf16.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_array_f16 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_s8_f16.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_bf16 [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_planar_complex_f16 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_u8_bf16.so [ 84%] Linking CUDA shared library libcutlass_conv2d_sm75_cf32_cfprop_optimized_cf32.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_s8_bf16 [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_s8_f16 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816gemm_u8_f16.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16816tf32spgemm.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_u8_bf16 [ 84%] Built target cutlass_library_conv2d_sm75_cf32_cfprop_optimized_cf32 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16832spgemm_bf16.so [ 84%] Built target cutlass_library_gemm_sm80_s16816gemm_u8_f16 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s16832spgemm_f16.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s1688bf16gemm.so [ 84%] Built target cutlass_library_gemm_sm80_s16816tf32spgemm [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s1688f16gemm.so [ 84%] Built target cutlass_library_gemm_sm80_s16832spgemm_bf16 [ 84%] Built target cutlass_library_gemm_sm80_s16832spgemm_f16 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s1688gemm.so [ 84%] Built target cutlass_library_gemm_sm80_s1688bf16gemm [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s1688gemm_tf32.so [ 84%] Built target cutlass_library_gemm_sm80_s1688f16gemm [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s1688tf32gemm.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s4_i168128spgemm_s4.so [ 84%] Built target cutlass_library_gemm_sm80_s1688gemm [ 84%] Built target cutlass_library_gemm_sm80_s1688gemm_tf32 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s4_i16864gemm_s4.so [ 84%] Built target cutlass_library_gemm_sm80_s1688tf32gemm [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s8_i16832gemm_s8.so [ 84%] Built target cutlass_library_gemm_sm80_s4_i168128spgemm_s4 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_s8_i16864spgemm_s8.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_sgemm.so [ 84%] Built target cutlass_library_gemm_sm80_s4_i16864gemm_s4 [ 84%] Built target cutlass_library_gemm_sm80_s8_i16832gemm_s8 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_tf32_s1688gemm_tf32.so [ 84%] Built target cutlass_library_gemm_sm80_s8_i16864spgemm_s8 [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_u4_i16864gemm_u4.so [ 84%] Built target cutlass_library_gemm_sm80_sgemm [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_u8_i16832gemm_u8.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm80_z884gemm.so [ 84%] Built target cutlass_library_gemm_sm80_tf32_s1688gemm_tf32 [ 84%] Built target cutlass_library_gemm_sm80_u4_i16864gemm_u4 [ 84%] Linking CUDA shared library libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3.so [ 84%] Built target cutlass_library_gemm_sm80_u8_i16832gemm_u8 [ 84%] Linking CUDA shared library libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2.so [ 84%] Linking CUDA shared library libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2.so [ 84%] Built target cutlass_library_gemm_sm80_z884gemm [ 84%] Linking CUDA shared library libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3.so [ 84%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3 [ 84%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2 [ 84%] Linking CUDA shared library libcutlass_gemm_sm89_s16832gemm_e4m3.so [ 84%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2 [ 84%] Linking CUDA shared library libcutlass_gemm_sm89_s16832gemm_e4m3_e5m2.so [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16832gemm_e5m2.so [ 85%] Built target cutlass_library_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3 [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16832gemm_e5m2_e4m3.so [ 85%] Built target cutlass_library_gemm_sm89_s16832gemm_e4m3 [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3.so [ 85%] Built target cutlass_library_gemm_sm89_s16832gemm_e4m3_e5m2 [ 85%] Built target cutlass_library_gemm_sm89_s16832gemm_e5m2 [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2.so [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2.so [ 85%] Built target cutlass_library_gemm_sm89_s16832gemm_e5m2_e4m3 [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3.so [ 85%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3 [ 85%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2 [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16864spgemm_e4m3.so [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16864spgemm_e4m3_e5m2.so [ 85%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2 [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16864spgemm_e5m2.so [ 85%] Built target cutlass_library_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3 [ 85%] Linking CUDA shared library libcutlass_gemm_sm89_s16864spgemm_e5m2_e4m3.so [ 85%] Built target cutlass_library_gemm_sm89_s16864spgemm_e4m3 [ 85%] Built target cutlass_library_gemm_sm89_s16864spgemm_e4m3_e5m2 [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_bf16_s64x128x16gemm_bf16.so [ 85%] Built target cutlass_library_gemm_sm89_s16864spgemm_e5m2 [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3.so [ 85%] Built target cutlass_library_gemm_sm89_s16864spgemm_e5m2_e4m3 [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2.so [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2.so [ 85%] Built target cutlass_library_gemm_sm90_bf16_s64x128x16gemm_bf16 [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3.so [ 85%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3 [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_d1684gemm.so [ 85%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2 [ 85%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2 [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_f16_s64x128x16gemm_f16.so [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3.so [ 85%] Built target cutlass_library_gemm_sm90_d1684gemm [ 85%] Built target cutlass_library_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3 [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2.so [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2.so [ 85%] Built target cutlass_library_gemm_sm90_f16_s64x128x16gemm_f16 [ 85%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3 [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3.so [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_gz1684gemm.so [ 85%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2 [ 85%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2 [ 85%] Linking CUDA shared library libcutlass_gemm_sm90_h64x128x16gemm.so [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_i64x128x32gemm_s8.so [ 86%] Built target cutlass_library_gemm_sm90_gz1684gemm [ 86%] Built target cutlass_library_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3 [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_i64x128x32gemm_u8.so [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s64x128x16gemm_bf16.so [ 86%] Built target cutlass_library_gemm_sm90_i64x128x32gemm_s8 [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s64x128x16gemm_f16.so [ 86%] Built target cutlass_library_gemm_sm90_h64x128x16gemm [ 86%] Built target cutlass_library_gemm_sm90_i64x128x32gemm_u8 [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s64x128x32gemm_e4m3.so [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s64x128x32gemm_e4m3_e5m2.so [ 86%] Built target cutlass_library_gemm_sm90_s64x128x16gemm_bf16 [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s64x128x32gemm_e5m2.so [ 86%] Built target cutlass_library_gemm_sm90_s64x128x16gemm_f16 [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s64x128x32gemm_e5m2_e4m3.so [ 86%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e4m3 [ 86%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e4m3_e5m2 [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s64x128x8gemm_tf32.so [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s64x128x8tf32gemm.so [ 86%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e5m2 [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s8_i64x128x32gemm_s8.so [ 86%] Built target cutlass_library_gemm_sm90_s64x128x32gemm_e5m2_e4m3 [ 86%] Built target cutlass_library_gemm_sm90_s64x128x8gemm_tf32 [ 86%] Built target cutlass_library_gemm_sm90_s64x128x8tf32gemm [ 86%] Linking CUDA shared library libcutlass_gemm_sm90_s8_i64x128x32gemm_u8.so [ 87%] Linking CUDA shared library libcutlass_gemm_sm90_void_i64x128x32gemm_s8.so [ 87%] Built target cutlass_library_gemm_sm90_s8_i64x128x32gemm_s8 [ 87%] Linking CUDA shared library libcutlass_gemm_sm90_void_i64x128x32gemm_u8.so [ 87%] Linking CUDA shared library libcutlass_gemm_sm90_void_s64x128x16gemm_bf16.so [ 87%] Built target cutlass_library_gemm_sm90_s8_i64x128x32gemm_u8 [ 87%] Built target cutlass_library_gemm_sm90_void_i64x128x32gemm_s8 [ 87%] Built target cutlass_library_gemm_sm90_void_i64x128x32gemm_u8 [ 87%] Linking CUDA shared library libcutlass_gemm_sm90_void_s64x128x16gemm_f16.so [ 88%] Linking CUDA shared library libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3.so [ 88%] Linking CUDA shared library libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2.so [ 88%] Built target cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16 [ 88%] Linking CUDA shared library libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2.so [ 88%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3 [ 88%] Built target cutlass_library_gemm_sm90_void_s64x128x16gemm_f16 [ 88%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2 [ 88%] Linking CUDA shared library libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3.so [ 88%] Linking CUDA shared library libcutlass_gemm_sm90_z1684gemm.so [ 88%] Linking CUDA shared library libcutlass_conv2d_sm50_cf32_cdgrad_optimized_cf32.so [ 88%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2 [ 88%] Linking CUDA shared library libcutlass_conv2d_sm50_cf32_cfprop_optimized_cf32.so [ 88%] Built target cutlass_library_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3 [ 88%] Built target cutlass_library_conv2d_sm50_cf32_cdgrad_optimized_cf32 [ 88%] Built target cutlass_library_gemm_sm90_z1684gemm [ 88%] Linking CUDA shared library libcutlass_conv2d_sm50_cf32_cwgrad_optimized_cf32.so [ 88%] Linking CUDA shared library libcutlass_conv2d_sm50_sfprop_optimized.so [ 88%] Linking CUDA shared library libcutlass_conv2d_sm50_sdgrad_optimized.so [ 88%] Built target cutlass_library_conv2d_sm50_cf32_cfprop_optimized_cf32 [ 88%] Linking CUDA shared library libcutlass_conv2d_sm50_swgrad_optimized.so [ 88%] Built target cutlass_library_conv2d_sm50_cf32_cwgrad_optimized_cf32 [ 88%] Built target cutlass_library_conv2d_sm50_sfprop_optimized [ 88%] Built target cutlass_library_conv2d_sm50_sdgrad_optimized [ 88%] Linking CUDA shared library libcutlass_conv2d_sm60_hfprop_optimized.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm70_f16_s884fprop_optimized_f16.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm70_f16_s884dgrad_optimized_f16.so [ 89%] Built target cutlass_library_conv2d_sm50_swgrad_optimized [ 89%] Linking CUDA shared library libcutlass_conv2d_sm70_f16_s884wgrad_optimized_f16.so [ 89%] Built target cutlass_library_conv2d_sm60_hfprop_optimized [ 89%] Built target cutlass_library_conv2d_sm70_f16_s884dgrad_optimized_f16 [ 89%] Built target cutlass_library_conv2d_sm70_f16_s884fprop_optimized_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm70_h884dgrad_optimized.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm70_h884fprop_optimized.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm70_h884wgrad_optimized.so [ 89%] Built target cutlass_library_conv2d_sm70_f16_s884wgrad_optimized_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm70_s884dgrad_optimized_f16.so [ 89%] Built target cutlass_library_conv2d_sm70_h884dgrad_optimized [ 89%] Built target cutlass_library_conv2d_sm70_h884fprop_optimized [ 89%] Linking CUDA shared library libcutlass_conv2d_sm70_s884fprop_optimized_f16.so [ 89%] Built target cutlass_library_conv2d_sm70_h884wgrad_optimized [ 89%] Built target cutlass_library_conv2d_sm70_s884dgrad_optimized_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm70_s884wgrad_optimized_f16.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_cf32_cdgrad_optimized_cf32.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_cf32_cwgrad_optimized_cf32.so [ 89%] Built target cutlass_library_conv2d_sm70_s884fprop_optimized_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_f16_s1688dgrad_optimized_f16.so [ 89%] Built target cutlass_library_conv2d_sm70_s884wgrad_optimized_f16 [ 89%] Built target cutlass_library_conv2d_sm75_cf32_cdgrad_optimized_cf32 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_f16_s1688fprop_few_channels_f16.so [ 89%] Built target cutlass_library_conv2d_sm75_cf32_cwgrad_optimized_cf32 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_f16_s1688fprop_fixed_channels_f16.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_f16_s1688fprop_optimized_f16.so [ 89%] Built target cutlass_library_conv2d_sm75_f16_s1688dgrad_optimized_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_f16_s1688wgrad_optimized_f16.so [ 89%] Built target cutlass_library_conv2d_sm75_f16_s1688fprop_few_channels_f16 [ 89%] Built target cutlass_library_conv2d_sm75_f16_s1688fprop_fixed_channels_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_h1688dgrad_optimized.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_h1688fprop_few_channels.so [ 89%] Built target cutlass_library_conv2d_sm75_f16_s1688fprop_optimized_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_h1688fprop_fixed_channels.so [ 89%] Built target cutlass_library_conv2d_sm75_f16_s1688wgrad_optimized_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_h1688fprop_optimized.so [ 89%] Built target cutlass_library_conv2d_sm75_h1688dgrad_optimized [ 89%] Built target cutlass_library_conv2d_sm75_h1688fprop_few_channels [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_h1688wgrad_optimized.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_i8816fprop_optimized_s8.so [ 89%] Built target cutlass_library_conv2d_sm75_h1688fprop_fixed_channels [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_i8816fprop_optimized_u8.so [ 89%] Built target cutlass_library_conv2d_sm75_h1688fprop_optimized [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_i8832fprop_optimized_s4.so [ 89%] Built target cutlass_library_conv2d_sm75_h1688wgrad_optimized [ 89%] Built target cutlass_library_conv2d_sm75_i8816fprop_optimized_s8 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_i8832fprop_optimized_u4.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_s1688dgrad_optimized_f16.so [ 89%] Built target cutlass_library_conv2d_sm75_i8816fprop_optimized_u8 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_s1688fprop_few_channels_f16.so [ 89%] Built target cutlass_library_conv2d_sm75_i8832fprop_optimized_s4 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_s1688fprop_fixed_channels_f16.so [ 89%] Built target cutlass_library_conv2d_sm75_i8832fprop_optimized_u4 [ 89%] Built target cutlass_library_conv2d_sm75_s1688dgrad_optimized_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_s1688fprop_optimized_f16.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_s1688wgrad_optimized_f16.so [ 89%] Built target cutlass_library_conv2d_sm75_s1688fprop_few_channels_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_s4_i8832fprop_optimized_s4.so [ 89%] Built target cutlass_library_conv2d_sm75_s1688fprop_fixed_channels_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_s8_i8816fprop_few_channels_s8.so [ 89%] Built target cutlass_library_conv2d_sm75_s1688fprop_optimized_f16 [ 89%] Built target cutlass_library_conv2d_sm75_s1688wgrad_optimized_f16 [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_s8_i8816fprop_fixed_channels_s8.so [ 89%] Linking CUDA shared library libcutlass_conv2d_sm75_s8_i8816fprop_optimized_s8.so [ 89%] Built target cutlass_library_conv2d_sm75_s4_i8832fprop_optimized_s4 [ 90%] Linking CUDA shared library libcutlass_conv2d_sm75_u4_i8832fprop_optimized_u4.so [ 90%] Built target cutlass_library_conv2d_sm75_s8_i8816fprop_few_channels_s8 [ 90%] Built target cutlass_library_conv2d_sm75_s8_i8816fprop_fixed_channels_s8 [ 90%] Linking CUDA shared library libcutlass_conv2d_sm75_u8_i8816fprop_few_channels_u8.so [ 90%] Built target cutlass_library_conv2d_sm75_s8_i8816fprop_optimized_s8 [ 90%] Linking CUDA shared library libcutlass_conv2d_sm75_u8_i8816fprop_fixed_channels_u8.so [ 90%] Linking CUDA shared library libcutlass_conv2d_sm75_u8_i8816fprop_optimized_u8.so [ 90%] Built target cutlass_library_conv2d_sm75_u4_i8832fprop_optimized_u4 [ 90%] Linking CUDA shared library libcutlass_conv2d_sm80_bf16_s16816dgrad_optimized_bf16.so [ 90%] Built target cutlass_library_conv2d_sm75_u8_i8816fprop_few_channels_u8 [ 90%] Linking CUDA shared library libcutlass_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16.so [ 90%] Built target cutlass_library_conv2d_sm75_u8_i8816fprop_fixed_channels_u8 [ 90%] Built target cutlass_library_conv2d_sm75_u8_i8816fprop_optimized_u8 [ 90%] Linking CUDA shared library libcutlass_conv2d_sm80_bf16_s16816fprop_optimized_bf16.so [ 90%] Linking CUDA shared library libcutlass_conv2d_sm80_bf16_s16816wgrad_optimized_bf16.so [ 90%] Built target cutlass_library_conv2d_sm80_bf16_s16816dgrad_optimized_bf16 [ 91%] Linking CUDA shared library libcutlass_conv2d_sm80_f16_s16816dgrad_optimized_f16.so [ 91%] Built target cutlass_library_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16 [ 91%] Linking CUDA shared library libcutlass_conv2d_sm80_f16_s16816fprop_fixed_channels_f16.so [ 91%] Built target cutlass_library_conv2d_sm80_bf16_s16816wgrad_optimized_bf16 [ 91%] Built target cutlass_library_conv2d_sm80_bf16_s16816fprop_optimized_bf16 [ 91%] Linking CUDA shared library libcutlass_conv2d_sm80_f16_s16816wgrad_optimized_f16.so [ 91%] Linking CUDA shared library libcutlass_conv2d_sm80_f16_s16816fprop_optimized_f16.so [ 91%] Built target cutlass_library_conv2d_sm80_f16_s16816dgrad_optimized_f16 [ 91%] Linking CUDA shared library libcutlass_conv2d_sm80_h16816dgrad_optimized.so [ 91%] Built target cutlass_library_conv2d_sm80_f16_s16816fprop_fixed_channels_f16 [ 91%] Linking CUDA shared library libcutlass_conv2d_sm80_h16816fprop_fixed_channels.so [ 91%] Built target cutlass_library_conv2d_sm80_f16_s16816wgrad_optimized_f16 [ 91%] Built target cutlass_library_conv2d_sm80_f16_s16816fprop_optimized_f16 [ 91%] Linking CUDA shared library libcutlass_conv2d_sm80_h16816fprop_optimized.so [ 91%] Linking CUDA shared library libcutlass_conv2d_sm80_h16816wgrad_optimized.so [ 91%] Built target cutlass_library_conv2d_sm80_h16816dgrad_optimized [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_i16832fprop_optimized_s8.so [ 92%] Built target cutlass_library_conv2d_sm80_h16816fprop_fixed_channels [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_i16832fprop_optimized_u8.so [ 92%] Built target cutlass_library_conv2d_sm80_h16816wgrad_optimized [ 92%] Built target cutlass_library_conv2d_sm80_h16816fprop_optimized [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_i16864fprop_optimized_s4.so [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_i16864fprop_optimized_u4.so [ 92%] Built target cutlass_library_conv2d_sm80_i16832fprop_optimized_s8 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s16816dgrad_optimized_bf16.so [ 92%] Built target cutlass_library_conv2d_sm80_i16832fprop_optimized_u8 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s16816dgrad_optimized_f16.so [ 92%] Built target cutlass_library_conv2d_sm80_i16864fprop_optimized_s4 [ 92%] Built target cutlass_library_conv2d_sm80_i16864fprop_optimized_u4 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s16816fprop_fixed_channels_bf16.so [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s16816fprop_fixed_channels_f16.so [ 92%] Built target cutlass_library_conv2d_sm80_s16816dgrad_optimized_bf16 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s16816fprop_optimized_bf16.so [ 92%] Built target cutlass_library_conv2d_sm80_s16816dgrad_optimized_f16 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s16816fprop_optimized_f16.so [ 92%] Built target cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_f16 [ 92%] Built target cutlass_library_conv2d_sm80_s16816fprop_fixed_channels_bf16 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s16816wgrad_optimized_bf16.so [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s16816wgrad_optimized_f16.so [ 92%] Built target cutlass_library_conv2d_sm80_s16816fprop_optimized_bf16 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688bf16dgrad_optimized.so [ 92%] Built target cutlass_library_conv2d_sm80_s16816fprop_optimized_f16 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688bf16fprop_optimized.so [ 92%] Built target cutlass_library_conv2d_sm80_s16816wgrad_optimized_bf16 [ 92%] Built target cutlass_library_conv2d_sm80_s16816wgrad_optimized_f16 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688bf16wgrad_optimized.so [ 92%] Built target cutlass_library_conv2d_sm80_s1688bf16dgrad_optimized [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688dgrad_optimized.so [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688dgrad_optimized_tf32.so [ 92%] Built target cutlass_library_conv2d_sm80_s1688bf16fprop_optimized [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688f16dgrad_optimized.so [ 92%] Built target cutlass_library_conv2d_sm80_s1688bf16wgrad_optimized [ 92%] Built target cutlass_library_conv2d_sm80_s1688dgrad_optimized [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688f16fprop_optimized.so [ 92%] Built target cutlass_library_conv2d_sm80_s1688dgrad_optimized_tf32 [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688f16wgrad_optimized.so [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688fprop_optimized.so [ 92%] Built target cutlass_library_conv2d_sm80_s1688f16dgrad_optimized [ 92%] Built target cutlass_library_conv2d_sm80_s1688f16fprop_optimized [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688fprop_optimized_tf32.so [ 92%] Built target cutlass_library_conv2d_sm80_s1688f16wgrad_optimized [ 92%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688tf32dgrad_optimized.so [ 93%] Built target cutlass_library_conv2d_sm80_s1688fprop_optimized [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688tf32fprop_optimized.so [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688tf32wgrad_optimized.so [ 93%] Built target cutlass_library_conv2d_sm80_s1688fprop_optimized_tf32 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688wgrad_optimized.so [ 93%] Built target cutlass_library_conv2d_sm80_s1688tf32dgrad_optimized [ 93%] Built target cutlass_library_conv2d_sm80_s1688tf32fprop_optimized [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_s1688wgrad_optimized_tf32.so [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_s4_i16864fprop_optimized_s4.so [ 93%] Built target cutlass_library_conv2d_sm80_s1688tf32wgrad_optimized [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_s8_i16832fprop_few_channels_s8.so [ 93%] Built target cutlass_library_conv2d_sm80_s1688wgrad_optimized [ 93%] Built target cutlass_library_conv2d_sm80_s1688wgrad_optimized_tf32 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_s8_i16832fprop_fixed_channels_s8.so [ 93%] Built target cutlass_library_conv2d_sm80_s4_i16864fprop_optimized_s4 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_s8_i16832fprop_optimized_s8.so [ 93%] Built target cutlass_library_conv2d_sm80_s8_i16832fprop_few_channels_s8 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_sdgrad_optimized.so [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_sfprop_optimized.so [ 93%] Built target cutlass_library_conv2d_sm80_s8_i16832fprop_fixed_channels_s8 [ 93%] Built target cutlass_library_conv2d_sm80_s8_i16832fprop_optimized_s8 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_swgrad_optimized.so [ 93%] Built target cutlass_library_conv2d_sm80_sdgrad_optimized [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_tf32_s1688dgrad_optimized_tf32.so [ 93%] Built target cutlass_library_conv2d_sm80_sfprop_optimized [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_tf32_s1688fprop_optimized_tf32.so [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_tf32_s1688wgrad_optimized_tf32.so [ 93%] Built target cutlass_library_conv2d_sm80_swgrad_optimized [ 93%] Built target cutlass_library_conv2d_sm80_tf32_s1688dgrad_optimized_tf32 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_u4_i16864fprop_optimized_u4.so [ 93%] Built target cutlass_library_conv2d_sm80_tf32_s1688fprop_optimized_tf32 [ 93%] Built target cutlass_library_conv2d_sm80_tf32_s1688wgrad_optimized_tf32 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_u8_i16832fprop_few_channels_u8.so [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_u8_i16832fprop_fixed_channels_u8.so [ 93%] Linking CUDA shared library libcutlass_conv2d_sm80_u8_i16832fprop_optimized_u8.so [ 93%] Built target cutlass_library_conv2d_sm80_u4_i16864fprop_optimized_u4 [ 93%] Built target cutlass_library_conv2d_sm80_u8_i16832fprop_few_channels_u8 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e4m3.so [ 93%] Built target cutlass_library_conv2d_sm80_u8_i16832fprop_fixed_channels_u8 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e5m2.so [ 93%] Built target cutlass_library_conv2d_sm80_u8_i16832fprop_optimized_u8 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm89_s16832fprop_optimized_e4m3.so [ 93%] Linking CUDA shared library libcutlass_conv2d_sm89_s16832fprop_optimized_e5m2.so [ 93%] Built target cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e4m3 [ 93%] Built target cutlass_library_conv2d_sm89_s16832fprop_fixed_channels_e5m2 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16.so [ 93%] Built target cutlass_library_conv2d_sm89_s16832fprop_optimized_e4m3 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16.so [ 93%] Built target cutlass_library_conv2d_sm89_s16832fprop_optimized_e5m2 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32.so [ 93%] Linking CUDA shared library libcutlass_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32.so [ 93%] Built target cutlass_library_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32.so [ 93%] Built target cutlass_library_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16 [ 93%] Built target cutlass_library_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32 [ 93%] Linking CUDA shared library libcutlass_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32.so [ 93%] Built target cutlass_library_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32 [ 93%] Linking CUDA shared library libcutlass_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16.so [ 93%] Linking CUDA shared library libcutlass_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16.so [ 93%] Built target cutlass_library_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32 [ 93%] Linking CUDA shared library libcutlass_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16.so [ 93%] Built target cutlass_library_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32 [ 93%] Built target cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16 [ 94%] Linking CUDA shared library libcutlass_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16.so [ 94%] Built target cutlass_library_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16 [ 94%] Linking CUDA shared library libcutlass_conv3d_sm80_f16_s16816dgrad3d_analytic_f16.so [ 94%] Linking CUDA shared library libcutlass_conv3d_sm80_f16_s16816dgrad3d_optimized_f16.so [ 94%] Built target cutlass_library_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16 [ 94%] Linking CUDA shared library libcutlass_conv3d_sm80_f16_s16816fprop3d_optimized_f16.so [ 94%] Built target cutlass_library_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16 [ 94%] Built target cutlass_library_conv3d_sm80_f16_s16816dgrad3d_analytic_f16 [ 94%] Linking CUDA shared library libcutlass_conv3d_sm80_f16_s16816wgrad3d_optimized_f16.so [ 94%] Built target cutlass_library_conv3d_sm80_f16_s16816dgrad3d_optimized_f16 [ 94%] Linking CUDA shared library libcutlass_conv3d_sm80_h16816dgrad3d_analytic.so [ 94%] Linking CUDA shared library libcutlass_conv3d_sm80_h16816dgrad3d_optimized.so [ 94%] Built target cutlass_library_conv3d_sm80_f16_s16816fprop3d_optimized_f16 [ 94%] Linking CUDA shared library libcutlass_conv3d_sm80_h16816fprop3d_optimized.so [ 94%] Built target cutlass_library_conv3d_sm80_f16_s16816wgrad3d_optimized_f16 [ 94%] Built target cutlass_library_conv3d_sm80_h16816dgrad3d_analytic [ 94%] Linking CUDA shared library libcutlass_conv3d_sm80_h16816wgrad3d_optimized.so [ 94%] Built target cutlass_library_conv3d_sm80_h16816dgrad3d_optimized [ 95%] Linking CUDA shared library libcutlass_conv3d_sm80_s16816dgrad3d_analytic_bf16.so [ 95%] Linking CUDA shared library libcutlass_conv3d_sm80_s16816dgrad3d_analytic_f16.so [ 95%] Built target cutlass_library_conv3d_sm80_h16816fprop3d_optimized [ 95%] Linking CUDA shared library libcutlass_conv3d_sm80_s16816dgrad3d_optimized_bf16.so [ 95%] Built target cutlass_library_conv3d_sm80_h16816wgrad3d_optimized [ 95%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_bf16 [ 95%] Linking CUDA shared library libcutlass_conv3d_sm80_s16816dgrad3d_optimized_f16.so [ 95%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_analytic_f16 [ 95%] Linking CUDA shared library libcutlass_conv3d_sm80_s16816fprop3d_optimized_bf16.so [ 95%] Linking CUDA shared library libcutlass_conv3d_sm80_s16816fprop3d_optimized_f16.so [ 95%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_bf16 [ 95%] Linking CUDA shared library libcutlass_conv3d_sm80_s16816wgrad3d_optimized_bf16.so [ 95%] Built target cutlass_library_conv3d_sm80_s16816dgrad3d_optimized_f16 [ 95%] Built target cutlass_library_conv3d_sm80_s16816fprop3d_optimized_bf16 [ 95%] Linking CUDA shared library libcutlass_conv3d_sm80_s16816wgrad3d_optimized_f16.so [ 95%] Built target cutlass_library_conv3d_sm80_s16816fprop3d_optimized_f16 [ 95%] Linking CUDA shared library libcutlass_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16.so [ 96%] Linking CUDA shared library libcutlass_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16.so [ 96%] Built target cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_bf16 [ 96%] Linking CUDA shared library libcutlass_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32.so [ 96%] Built target cutlass_library_conv3d_sm80_s16816wgrad3d_optimized_f16 [ 96%] Built target cutlass_library_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16 [ 96%] Linking CUDA shared library libcutlass_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32.so [ 96%] Built target cutlass_library_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16 [ 96%] Linking CUDA shared library libcutlass_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32.so [ 96%] Linking CUDA shared library libcutlass_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32.so [ 96%] Built target cutlass_library_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32 [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_c1688herk.so [ 96%] Built target cutlass_library_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32 [ 96%] Built target cutlass_library_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32 [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_c1688syrk.so [ 96%] Built target cutlass_library_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32 [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_c1688tf32herk.so [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_c1688tf32syrk.so [ 96%] Built target cutlass_library_rank_k_sm80_c1688herk [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_d884syrk.so [ 96%] Built target cutlass_library_rank_k_sm80_c1688syrk [ 96%] Built target cutlass_library_rank_k_sm80_c1688tf32herk [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_gz884herk.so [ 96%] Built target cutlass_library_rank_k_sm80_c1688tf32syrk [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_gz884syrk.so [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_s1688syrk.so [ 96%] Built target cutlass_library_rank_k_sm80_d884syrk [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_s1688tf32syrk.so [ 96%] Built target cutlass_library_rank_k_sm80_gz884herk [ 96%] Built target cutlass_library_rank_k_sm80_gz884syrk [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_z884herk.so [ 96%] Built target cutlass_library_rank_k_sm80_s1688syrk [ 96%] Linking CUDA shared library libcutlass_rank_k_sm80_z884syrk.so [ 96%] Linking CUDA shared library libcutlass_rank_k_sm90_d1684syrk.so [ 96%] Built target cutlass_library_rank_k_sm80_s1688tf32syrk [ 96%] Linking CUDA shared library libcutlass_rank_k_sm90_gz1684herk.so [ 96%] Built target cutlass_library_rank_k_sm80_z884herk [ 96%] Built target cutlass_library_rank_k_sm80_z884syrk [ 96%] Linking CUDA shared library libcutlass_rank_k_sm90_gz1684syrk.so [ 97%] Linking CUDA shared library libcutlass_rank_k_sm90_z1684herk.so [ 97%] Built target cutlass_library_rank_k_sm90_d1684syrk [ 97%] Linking CUDA shared library libcutlass_rank_k_sm90_z1684syrk.so [ 97%] Built target cutlass_library_rank_k_sm90_gz1684herk [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_c1688her2k.so [ 97%] Built target cutlass_library_rank_k_sm90_gz1684syrk [ 97%] Built target cutlass_library_rank_k_sm90_z1684herk [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_c1688syr2k.so [ 97%] Built target cutlass_library_rank_k_sm90_z1684syrk [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_c1688tf32her2k.so [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_c1688tf32syr2k.so [ 97%] Built target cutlass_library_rank_2k_sm80_c1688her2k [ 97%] Built target cutlass_library_rank_2k_sm80_c1688syr2k [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_d884syr2k.so [ 97%] Built target cutlass_library_rank_2k_sm80_c1688tf32her2k [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_gz884her2k.so [ 97%] Built target cutlass_library_rank_2k_sm80_c1688tf32syr2k [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_gz884syr2k.so [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_s1688syr2k.so [ 97%] Built target cutlass_library_rank_2k_sm80_d884syr2k [ 97%] Built target cutlass_library_rank_2k_sm80_gz884her2k [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_s1688tf32syr2k.so [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_z884her2k.so [ 97%] Built target cutlass_library_rank_2k_sm80_gz884syr2k [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm80_z884syr2k.so [ 97%] Built target cutlass_library_rank_2k_sm80_s1688syr2k [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm90_d1684syr2k.so [ 97%] Built target cutlass_library_rank_2k_sm80_s1688tf32syr2k [ 97%] Built target cutlass_library_rank_2k_sm80_z884her2k [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm90_gz1684her2k.so [ 97%] Built target cutlass_library_rank_2k_sm80_z884syr2k [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm90_gz1684syr2k.so [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm90_z1684her2k.so [ 97%] Built target cutlass_library_rank_2k_sm90_d1684syr2k [ 97%] Linking CUDA shared library libcutlass_rank_2k_sm90_z1684syr2k.so [ 97%] Built target cutlass_library_rank_2k_sm90_gz1684her2k [ 97%] Built target cutlass_library_rank_2k_sm90_gz1684syr2k [ 97%] Linking CUDA shared library libcutlass_trmm_sm80_c1688tf32trmm.so [ 97%] Linking CUDA shared library libcutlass_trmm_sm80_c1688trmm.so [ 97%] Built target cutlass_library_rank_2k_sm90_z1684her2k [ 97%] Built target cutlass_library_rank_2k_sm90_z1684syr2k [ 97%] Linking CUDA shared library libcutlass_trmm_sm80_d884trmm.so [ 97%] Linking CUDA shared library libcutlass_trmm_sm80_gz884trmm.so [ 97%] Built target cutlass_library_trmm_sm80_c1688tf32trmm [ 97%] Built target cutlass_library_trmm_sm80_c1688trmm [ 97%] Built target cutlass_library_trmm_sm80_d884trmm [ 97%] Linking CUDA shared library libcutlass_trmm_sm80_s1688tf32trmm.so [ 97%] Linking CUDA shared library libcutlass_trmm_sm80_s1688trmm.so [ 98%] Linking CUDA shared library libcutlass_trmm_sm80_z884trmm.so [ 98%] Built target cutlass_library_trmm_sm80_gz884trmm [ 98%] Linking CUDA shared library libcutlass_trmm_sm90_d1684trmm.so [ 98%] Built target cutlass_library_trmm_sm80_s1688tf32trmm [ 98%] Built target cutlass_library_trmm_sm80_s1688trmm [ 98%] Linking CUDA shared library libcutlass_trmm_sm90_gz1684trmm.so [ 98%] Built target cutlass_library_trmm_sm80_z884trmm [ 98%] Linking CUDA shared library libcutlass_trmm_sm90_z1684trmm.so [ 98%] Linking CUDA shared library libcutlass_symm_sm80_c1688hemm.so [ 98%] Built target cutlass_library_trmm_sm90_d1684trmm [ 98%] Linking CUDA shared library libcutlass_symm_sm80_c1688symm.so [ 98%] Built target cutlass_library_trmm_sm90_gz1684trmm [ 98%] Built target cutlass_library_symm_sm80_c1688hemm [ 98%] Built target cutlass_library_trmm_sm90_z1684trmm [ 98%] Linking CUDA shared library libcutlass_symm_sm80_c1688tf32hemm.so [ 99%] Linking CUDA shared library libcutlass_symm_sm80_c1688tf32symm.so [ 99%] Built target cutlass_library_symm_sm80_c1688symm [ 99%] Linking CUDA shared library libcutlass_symm_sm80_d884symm.so [ 99%] Linking CUDA shared library libcutlass_symm_sm80_gz884hemm.so [ 99%] Built target cutlass_library_symm_sm80_c1688tf32hemm [ 99%] Built target cutlass_library_symm_sm80_c1688tf32symm [ 99%] Built target cutlass_library_symm_sm80_d884symm [ 99%] Linking CUDA shared library libcutlass_symm_sm80_gz884symm.so [ 99%] Linking CUDA shared library libcutlass_symm_sm80_s1688symm.so [ 99%] Linking CUDA shared library libcutlass_symm_sm80_s1688tf32symm.so [ 99%] Built target cutlass_library_symm_sm80_gz884hemm [ 99%] Linking CUDA shared library libcutlass_symm_sm80_z884hemm.so [ 99%] Built target cutlass_library_symm_sm80_gz884symm [ 99%] Built target cutlass_library_symm_sm80_s1688symm [ 99%] Built target cutlass_library_symm_sm80_s1688tf32symm [ 99%] Linking CUDA shared library libcutlass_symm_sm80_z884symm.so [ 99%] Linking CUDA shared library libcutlass_symm_sm90_d1684symm.so [ 99%] Linking CUDA shared library libcutlass_symm_sm90_gz1684hemm.so [ 99%] Built target cutlass_library_symm_sm80_z884hemm [ 99%] Linking CUDA shared library libcutlass_symm_sm90_gz1684symm.so [ 99%] Built target cutlass_library_symm_sm80_z884symm [ 99%] Built target cutlass_library_symm_sm90_d1684symm [ 99%] Built target cutlass_library_symm_sm90_gz1684hemm [ 99%] Linking CUDA shared library libcutlass_symm_sm90_z1684hemm.so [ 99%] Linking CXX static library libcutlass.a [ 99%] Built target cutlass_library_symm_sm90_gz1684symm [ 99%] Built target cutlass_library_symm_sm90_z1684hemm [ 99%] Linking CXX shared library libcutlass.so [ 99%] Built target cutlass_library [ 99%] Building CXX object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/main.cpp.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/options.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/cutlass_profiler.cu.o [ 99%] Built target cutlass_library_static [ 99%] Building CXX object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/performance_report.cpp.o [ 99%] Building CXX object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/enumerated_types.cpp.o [ 99%] Building CXX object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/gpu_timer.cpp.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/device_allocation.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/device_context.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/cublas_helpers.cu.o [ 99%] Building CXX object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/cudnn_helpers.cpp.o [ 99%] Building CXX object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/problem_space.cpp.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/operation_profiler.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/gemm_operation_profiler.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/rank_k_operation_profiler.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/rank_2k_operation_profiler.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/trmm_operation_profiler.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/symm_operation_profiler.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/conv2d_operation_profiler.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/conv3d_operation_profiler.cu.o [ 99%] Building CUDA object tools/profiler/CMakeFiles/cutlass_profiler.dir/src/sparse_gemm_operation_profiler.cu.o [100%] Linking CXX executable cutlass_profiler [100%] Built target cutlass_profiler + popd ~/build/BUILD/cutlass-3.5.0-build/cutlass + RPM_EC=0 ++ jobs -p + exit 0 Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.7GJm7G + umask 022 + cd /builddir/build/BUILD/cutlass-3.5.0-build + '[' /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT '!=' / ']' + rm -rf /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT ++ dirname /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT + mkdir -p /builddir/build/BUILD/cutlass-3.5.0-build + mkdir /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT + CFLAGS=' ' + export CFLAGS + CXXFLAGS=' ' + export CXXFLAGS + FFLAGS=' -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS=' -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=gcc + export CC + CXX=g++ + export CXX + cd cutlass + rm -rf /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT ~/build/BUILD/cutlass-3.5.0-build/cutlass/build ~/build/BUILD/cutlass-3.5.0-build/cutlass + pushd build + DESTDIR=/builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT + /usr/bin/cmake --install . -- Install configuration: "Release" -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/functional.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/functional.h.fp16~ -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/workspace.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/wmma_array.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/version.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/uint128.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/warp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/warp/vector_fragment_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/vector_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_iterator_tensor_op_sm70.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_iterator_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_iterator_pitch_linear_2dthreadtile.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_iterator_pitch_linear.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_access_iterator_tensor_op_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_access_iterator_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_access_iterator_pitch_linear_direct_conv.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_access_iterator_pitch_linear.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_tile_access_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/regular_scale_bias_vector_access_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_vector_access_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_tile_iterator_triangular_matrix.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_tile_iterator_2dthreadtile.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_tile_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_tile_access_iterator_triangular_matrix.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_tile_access_iterator_params.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_tile_access_iterator_2dthreadtile.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_tile_access_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_scale_bias_vector_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/predicated_scale_bias_vector_access_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/ell_predicated_tile_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/ell_predicated_tile_access_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/threadblock/ell_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/thread -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/thread/unary_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/thread/transpose.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/pitch_linear_thread_map.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/collective -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/transform/collective/sm90_wgmma_transpose.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/trace.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/thread -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/thread/matrix.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/tfloat32.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/tensor_view_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/tensor_view.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/tensor_ref_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/tensor_ref.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/tensor_coord.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/subbyte_reference.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/semaphore.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/relatively_equal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/threadblock_swizzle.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/thread -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/thread/reduction_operators.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/thread/reduce.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/kernel -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/kernel/tensor_reduce_affine_strided.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/kernel/tensor_reduce_affine_contiguous.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/kernel/reduce_split_k.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/kernel/reduce_softmax_final.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/device -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/device/tensor_reduce_affine_strided.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/device/tensor_reduce_affine_contiguous.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/device/tensor_reduce.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/reduction/device/reduce_split_k.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/real.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/quaternion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/predicate_vector.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/platform -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/platform/platform.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/pitch_linear_coord.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/pipeline -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/pipeline/sm90_pipeline.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/pipeline/pipeline.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/numeric_types.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/numeric_size.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/numeric_conversion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/matrix_shape.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/matrix_coord.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/matrix.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout/vector.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout/tensor_op_multiplicand_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout/tensor_op_multiplicand_sm75.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout/tensor_op_multiplicand_sm70.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout/tensor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout/pitch_linear.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout/permute.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout/matrix.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/layout/layout.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/kernel_launch.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/kernel_hardware_info.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/kernel_hardware_info.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/integer_subbyte.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/half.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm_coord.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm_coord.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/tile_iterator_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/softmax_scale_bias_transform.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/scale_bias_tile_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_with_reduction_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_wmma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_tile_iterator_wmma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_tile_iterator_sparse.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_tile_iterator_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_tile_iterator_sm70.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_tile_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_tile_access_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_sm70.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_policy.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_fragment_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op_fast_f32.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_sparse_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_simt_tile_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_simt_policy.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_simt.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_mixed_input_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_gaussian_complex_tensor_op_tile_iterator_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_gaussian_complex_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_complex_tensor_op_tile_iterator_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_complex_tensor_op_fast_f32.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma_complex_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/mma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/layernorm_scale_bias_transform.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/default_mma_wmma_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/default_mma_with_reduction_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/default_mma_tensor_op_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/default_mma_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/default_mma_sparse_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/warp/default_mma_complex_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/threadblock_swizzle_streamk.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/threadblock_swizzle.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_with_reduction_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_sparse_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_sparse_base.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_softmax_mainloop_fusion_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_singlestage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_planar_complex_pipelined.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_planar_complex_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_planar_complex_base.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_pipelined.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_layernorm_mainloop_fusion_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_blas3_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/mma_base.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/index_remat.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/gemv.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/ell_mma_pipelined.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/ell_mma_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_trmm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_sparse_mma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_multistage_trmm_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_multistage_mma_complex_core_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_multistage_mma_complex_core.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_multistage_mma_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_with_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_softmax_mainloop_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_planar_complex_pipelined.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_planar_complex_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_layernorm_mainloop_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_core_wmma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_core_with_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_core_with_access_size.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_core_sparse_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_core_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_core_sm75.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_core_sm70.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_core_simt.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma_core.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_mma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_gemv_core.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/threadblock/default_ell_mma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/thread -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/thread/mma_sm61.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/thread/mma_sm60.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/thread/mma_sm50.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/thread/mma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/trmm_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/tile_scheduler_params.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/tile_scheduler.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/symm_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/static_tile_scheduler.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sparse_gemm_with_visitor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sparse_gemm_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sparse_gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_tile_scheduler_stream_k.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_tile_scheduler_group.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_tile_scheduler.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_gemm_warpspecialized_pingpong.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_gemm_warpspecialized_cooperative.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_gemm_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized_pingpong.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized_cooperative.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_gemm_tma.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm90_gemm_array_tma_warpspecialized_cooperative.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/sm70_gemm.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/rank_k_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/rank_2k_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/rank_2k_transpose_operands.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/rank_2k_grouped_problem_visitor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/rank_2k_grouped.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/params_universal_base.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/params_sparse_base.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/grouped_problem_visitor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemv_batched_strided.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemv.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_with_k_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_with_fused_epilogue.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_universal_with_visitor_streamk.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_universal_with_visitor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_universal_streamk.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_universal.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_transpose_operands.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_streamk_with_fused_epilogue.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_splitk_parallel.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_planar_complex_array.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_pipelined.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_params.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_layernorm_mainloop_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_grouped_softmax_mainloop_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_grouped_problem_visitor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_grouped.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_batched.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm_array.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/ell_gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_trmm_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_trmm_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_trmm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_symm_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_symm_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_symm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_rank_k_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_rank_k_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_rank_k.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_rank_2k_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_rank_2k_grouped.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_rank_2k_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_rank_2k.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemv.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_with_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_with_k_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_universal_with_visitor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_streamk_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_splitk_parallel.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_sparse_with_visitor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_sparse_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_sparse.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_planar_complex_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_layernorm_mainloop_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_grouped_softmax_mainloop_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_grouped.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/kernel/default_ell_gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/group_array_problem_shape.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/gemm_enumerated_types.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/dispatch_policy.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/trmm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/symm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/rank_k.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/rank_2k_grouped.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/rank_2k.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemv.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_with_k_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_universal_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_universal_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_universal_streamk_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_universal_base.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_universal_adapter.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_universal.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_splitk_parallel.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_sparse_with_visitor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_sparse_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_sparse.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_layernorm_mainloop_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_grouped.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_batched.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm_array.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/ell_gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/default_gemm_configuration.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/device/base_grouped.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm90_mma_tma_gmma_ss_warpspecialized_fp8.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm90_mma_tma_gmma_ss_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm90_mma_tma_gmma_ss.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm90_mma_tma_gmma_rs_warpspecialized_mixed_input.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm90_mma_tma_gmma_rs_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm90_mma_multistage_gmma_ss_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm90_mma_multistage_gmma_rs_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm90_mma_array_tma_gmma_ss_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm80_mma_multistage.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/sm70_mma_twostage.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/fp8_accumulation.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/collective_mma.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/collective_builder.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/builders -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/builders/sm90_gmma_builder.inl -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/gemm/collective/builders/sm90_common.inl -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/floating_point_nvrtc.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/float8.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/fast_math.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/wmma_tensor_op_policy.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/volta_tensor_op_policy.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/tile_iterator_wmma_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/tile_iterator_volta_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/tile_iterator_tensor_op_mixed.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/tile_iterator_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/tile_iterator_simt.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/tensor_op_policy.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/simt_policy.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/fragment_iterator_wmma_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/fragment_iterator_volta_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/fragment_iterator_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/fragment_iterator_simt.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/fragment_iterator_gaussian_complex_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/warp/fragment_iterator_complex_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/shared_load_iterator_pitch_linear.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/shared_load_iterator_mixed.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/shared_load_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/predicated_tile_iterator_strided_dgrad.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/predicated_tile_iterator_predicates.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/predicated_tile_iterator_params.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/predicated_tile_iterator_direct_conv.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/predicated_tile_iterator_conv.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/predicated_tile_iterator_blas3.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/predicated_tile_iterator_affine_layout_params.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/predicated_tile_iterator_affine.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/predicated_tile_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/output_tile_thread_map.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/output_iterator_parameter.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/interleaved_epilogue.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/fusion -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/fusion/visitors.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/fusion/visitor_store.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/fusion/visitor_load.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/fusion/visitor_compute.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/fusion/visitor_2x.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_workspace.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_with_visitor_callbacks.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_with_visitor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_with_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_visitor_with_softmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_streamk_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_smem_accumulator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_gemm_k_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_direct_store.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_depthwise.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_base_streamk.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue_base.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/epilogue.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/direct_store_epilogue_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_thread_map_wmma_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_thread_map_volta_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_thread_map_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_thread_map_simt.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_wmma_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_with_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_volta_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_tensor_op_blas3.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_simt.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_direct_store.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_complex_tensor_op_blas3.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/threadblock/default_epilogue_complex_tensor_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/scale_type.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/reduction_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_with_elementwise.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_tensor_broadcast.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_silu.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_sigmoid.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_residual_block.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_relu0.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_relu.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_params.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_leaky_relu.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_hardswish.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_generic_with_scaling.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_generic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_gelu.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_drelu.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_dgelu.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_clamp.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_bias_relu.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination_bias_elementwise.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/linear_combination.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/detail.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/conversion_op.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/thread/activation.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/fusion -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/fusion/sm90_visitor_tma_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/fusion/sm90_visitor_store_tma_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/fusion/sm90_visitor_load_tma_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/fusion/sm90_visitor_compute_tma_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/fusion/sm90_callbacks_tma_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/fusion/operations.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/fusion/callbacks.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/dispatch_policy.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/sm90_epilogue_tma_warpspecialized_bias_elementwise.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/sm90_epilogue_tma_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/sm70_epilogue_vectorized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/epilogue_tensor_broadcast.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/detail.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/default_epilogue_array.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/default_epilogue.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/collective_epilogue.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/collective_builder.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/builders -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/epilogue/collective/builders/sm90_builder.inl -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/device_kernel.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/detail -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/detail/mma.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/detail/layout.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/detail/helper_macros.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/detail/dependent_false.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/detail/collective.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/cutlass.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/cuda_host_adapter.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/core_io.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/coord.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/warp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/warp/scale_bias_relu_transform.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/warp/mma_depthwise_simt_tile_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/warp/mma_depthwise_simt.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/threadblock_swizzle.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/predicated_scale_bias_vector_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/predicated_scale_bias_vector_access_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/implicit_gemm_wgrad_fusion_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/implicit_gemm_pipelined.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/implicit_gemm_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/implicit_gemm_fprop_fusion_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/depthwise_mma_core_with_lane_access_size.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/depthwise_mma_base.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/depthwise_fprop_pipelined.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/depthwise_fprop_filter_tile_access_iterator_direct_conv_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/depthwise_fprop_direct_conv_multistage.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/depthwise_fprop_activation_tile_access_iterator_direct_conv_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/depthwise_fprop_activation_tile_access_iterator_direct_conv_fixed_stride_dilation.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/depthwise_direct_conv_params.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_wgrad_output_gradient_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_wgrad_output_gradient_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_wgrad_activation_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_wgrad_activation_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_params.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_fprop_filter_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_fprop_filter_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_fprop_activation_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_fprop_activation_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_dgrad_output_gradient_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_dgrad_output_gradient_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_dgrad_filter_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv3d_dgrad_filter_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_wgrad_output_gradient_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_wgrad_output_gradient_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_wgrad_activation_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_wgrad_activation_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_tile_iterator.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_params.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_fprop_filter_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_fprop_filter_tile_access_iterator_fixed_channels.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_fprop_filter_tile_access_iterator_few_channels.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_fprop_filter_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_fprop_activation_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_fprop_activation_tile_access_iterator_fixed_channels.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_fprop_activation_tile_access_iterator_few_channels.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_fprop_activation_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_dgrad_output_gradient_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_dgrad_output_gradient_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_dgrad_filter_tile_access_iterator_optimized.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/threadblock/conv2d_dgrad_filter_tile_access_iterator_analytic.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/thread -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/thread/depthwise_mma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/sm90_implicit_gemm_tma_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/implicit_gemm_convolution_with_fused_epilogue.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/implicit_gemm_convolution_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/implicit_gemm_convolution_strided_dgrad.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/implicit_gemm_convolution_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/implicit_gemm_convolution.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/direct_convolution.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_depthwise_fprop.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_deconv3d_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_deconv3d.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_deconv2d_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_deconv2d.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv3d_wgrad.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv3d_fprop_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv3d_fprop_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv3d_fprop.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv3d_dgrad.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d_wgrad_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d_wgrad.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d_group_fprop.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d_fprop_with_reduction.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d_fprop_with_broadcast.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d_fprop_with_absmax.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d_fprop_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d_fprop.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d_dgrad.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/default_conv2d.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/kernel/conv_universal.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/dispatch_policy.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/device -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/device/implicit_gemm_convolution_fusion.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/device/implicit_gemm_convolution.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/device/direct_convolution.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/device/conv_universal_adapter.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/convolution.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/convnd_problem_shape.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/conv3d_problem_size.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/conv2d_problem_size.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/collective -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/collective/sm90_implicit_gemm_gmma_ss_warpspecialized.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/collective/detail.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/collective/collective_conv.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/collective/collective_builder.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/collective/builders -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/collective/builders/sm90_gmma_builder.inl -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/conv/collective/builders/sm90_common.inl -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/constants.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/cluster_launch.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/block_striped.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/blas3_types.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/blas3.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/bfloat16.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/barrier.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/array_subbyte.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/array_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/array.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/wmma_sm75.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/wmma_sm72.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/wmma_sm70.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/wmma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/simd_sm61.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/simd_sm60.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/simd.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/reg_reconfig.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sparse_sm89.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sparse_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sm90.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sm89.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sm75.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sm70.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sm61.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sm60.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma_sm50.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/mma.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/memory_sm80.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/memory_sm75.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/memory.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/cache_operation.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/barrier.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/arch/arch.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/aligned_buffer.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/util -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/util/type_traits.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/util/print.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/util/debug.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/underscore.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/tensor_predicate.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/tensor.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/swizzle_layout.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/swizzle.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/stride.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/pointer_swizzle.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/pointer_flagged.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/pointer_base.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/pointer.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric/real.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric/numeric_types.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric/math.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric/integral_ratio.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric/integral_constant.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric/integer_sequence.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric/int.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric/complex.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/numeric/arithmetic_tuple.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/layout_composed.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/layout.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/int_tuple.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/container -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/container/type_list.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/container/tuple.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/container/cuda_types.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/container/bit_field.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/container/array_subbyte.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/container/array_aligned.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/container/array.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/container/alignment.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/config.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/mma_traits_sm90_gmma.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/mma_traits_sm90.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/mma_traits_sm80.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/mma_traits_sm75.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/mma_traits_sm70.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/mma_traits_sm61.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/mma_traits.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/mma_atom.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/copy_traits_sm90_tma_swizzle.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/copy_traits_sm90_tma.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/copy_traits_sm90_im2col.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/copy_traits_sm90.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/copy_traits_sm80.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/copy_traits_sm75.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/copy_traits_sm50.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/copy_traits.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/atom/copy_atom.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/util.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/mma_sm90_gmma.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/mma_sm90_desc.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/mma_sm90.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/mma_sm80.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/mma_sm75.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/mma_sm70.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/mma_sm61.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/mma.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/copy_sm90_tma.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/copy_sm90_desc.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/copy_sm90.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/copy_sm80.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/copy_sm75.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/copy_sm50.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/copy.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/arch/cluster_sm90.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/tuple_algorithms.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/tensor_algorithms.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/prefetch.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/prefer.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/gemm.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/functional.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/fill.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/copy.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/cooperative_gemm.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/cooperative_copy.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/clear.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cute/algorithm/axpby.hpp -- Up-to-date: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include -- Up-to-date: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/cutlass/version_extended.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/test/cutlass -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/test/cutlass/bin -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/test/cutlass/lib64 -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/test/cutlass/ctest -- Up-to-date: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/ -- Up-to-date: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/type_traits.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/tensor_view_io.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/trmm_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/trmm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_reduce.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_reduce.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_norm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_foreach.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_fill.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_fill.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_elementwise.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_copy.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_compare.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/tensor_compare.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/symm_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/symm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/rank_k_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/rank_2k_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/rank_2k.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/gett.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/gemm_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/gemm_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/error_metrics.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/convolution.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/host/conv.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/thread -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/thread/gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/tensor_relu.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/tensor_reduce.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/tensor_foreach.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/tensor_fill.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/tensor_compare.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/rank_2k_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/kernel -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/kernel/tensor_foreach.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/kernel/tensor_elementwise.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/kernel/gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/gett.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/gemm_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/gemm_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/gemm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/device/convolution.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/detail -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/detail/linear_to_coordinate.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/reference/detail/inner_product.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/print_error.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/packed_stride.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/index_sequence.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/host_uncompress.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/host_tensor_planar_complex.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/host_tensor.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/host_reorder.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/helper_cuda.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/gett_commandline.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/exceptions.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/distribution.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_utils.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_rmsnorm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_nhwc_to_nchw.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_nhwc_pooling.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_nhwc_padding.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_nchw_to_nhwc.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_memory.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_layernorm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_groupnorm.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/device_dump.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/debug.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/cublas_wrappers.hpp -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/command_line.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/util/GPU_Clock.hpp -- Up-to-date: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include/ -- Up-to-date: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library/util.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library/types.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library/singleton.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library/operation_table.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library/manifest.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library/library.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library/handle.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library/descriptions.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/include//cutlass/library/arch_mappings.h -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_cgemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_cgemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_dgemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_dgemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_sgemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_sgemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm60_hgemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm60_hgemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm61_igemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm61_igemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm61_s8_igemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm61_s8_igemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_planar_complex_array_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_planar_complex_array_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_planar_complex_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_planar_complex_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm_planar_complex.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm_planar_complex.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm_planar_complex_array.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm_planar_complex_array.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_planar_complex_array_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_planar_complex_array_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_planar_complex_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_planar_complex_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_array_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_array_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm_planar_complex.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm_planar_complex.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm_planar_complex_array.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm_planar_complex_array.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i88128xorgemm_b1.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i88128xorgemm_b1.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8816gemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8816gemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8816gemm_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8816gemm_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8832gemm_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8832gemm_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8832gemm_u4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8832gemm_u4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_planar_complex_array_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_planar_complex_array_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_planar_complex_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_planar_complex_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s4_i8832gemm_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s4_i8832gemm_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s8_i8816gemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s8_i8816gemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_u4_i8832gemm_u4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_u4_i8832gemm_u4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_u8_i8816gemm_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_u8_i8816gemm_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_s8_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_s8_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_u8_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_u8_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16832spgemm_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16832spgemm_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_c1688gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_c1688gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_c1688tf32gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_c1688tf32gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_cgemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_cgemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_d884gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_d884gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_dgemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_dgemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_array_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_array_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_s8_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_s8_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_u8_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_u8_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16832spgemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16832spgemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_gz884gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_gz884gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_grouped.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_grouped.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_planar_complex.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_planar_complex.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_planar_complex_array.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_planar_complex_array.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_s8_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_s8_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16832spgemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16832spgemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168128spgemm_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168128spgemm_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168256andgemm_b1.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168256andgemm_b1.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168256xorgemm_b1.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168256xorgemm_b1.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16832gemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16832gemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16832gemm_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16832gemm_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864gemm_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864gemm_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864gemm_u4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864gemm_u4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864spgemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864spgemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_grouped_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_grouped_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_grouped_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_grouped_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_array_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_array_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_array_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_array_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_s8_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_s8_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_s8_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_s8_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_u8_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_u8_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_u8_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_u8_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816tf32spgemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816tf32spgemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16832spgemm_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16832spgemm_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16832spgemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16832spgemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688bf16gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688bf16gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688f16gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688f16gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688gemm_tf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688gemm_tf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688tf32gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688tf32gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s4_i168128spgemm_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s4_i168128spgemm_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s4_i16864gemm_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s4_i16864gemm_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s8_i16832gemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s8_i16832gemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s8_i16864spgemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s8_i16864spgemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_sgemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_sgemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_tf32_s1688gemm_tf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_tf32_s1688gemm_tf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_u4_i16864gemm_u4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_u4_i16864gemm_u4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_u8_i16832gemm_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_u8_i16832gemm_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_z884gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_z884gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e4m3_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e4m3_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e5m2_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e5m2_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e4m3_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e4m3_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e5m2_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e5m2_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x16gemm_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x16gemm_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_d1684gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_d1684gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x16gemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x16gemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_gz1684gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_gz1684gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_h64x128x16gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_h64x128x16gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_i64x128x32gemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_i64x128x32gemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_i64x128x32gemm_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_i64x128x32gemm_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x16gemm_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x16gemm_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x16gemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x16gemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e4m3_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e4m3_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e5m2_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e5m2_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x8gemm_tf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x8gemm_tf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x8tf32gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x8tf32gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s8_i64x128x32gemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s8_i64x128x32gemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s8_i64x128x32gemm_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s8_i64x128x32gemm_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_i64x128x32gemm_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_i64x128x32gemm_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_i64x128x32gemm_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_i64x128x32gemm_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x16gemm_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x16gemm_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x16gemm_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x16gemm_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_z1684gemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_z1684gemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cdgrad_optimized_cf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cdgrad_optimized_cf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cfprop_optimized_cf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cfprop_optimized_cf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cwgrad_optimized_cf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cwgrad_optimized_cf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_sdgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_sdgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_sfprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_sfprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_swgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_swgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm60_hfprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm60_hfprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884dgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884dgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884fprop_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884fprop_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884wgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884wgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884dgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884dgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884fprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884fprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884wgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884wgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884dgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884dgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884fprop_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884fprop_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884wgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884wgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cdgrad_optimized_cf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cdgrad_optimized_cf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cfprop_optimized_cf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cfprop_optimized_cf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cwgrad_optimized_cf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cwgrad_optimized_cf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688dgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688dgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_few_channels_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_few_channels_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_fixed_channels_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_fixed_channels_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688wgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688wgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688dgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688dgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_few_channels.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_few_channels.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_fixed_channels.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_fixed_channels.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688wgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688wgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8816fprop_optimized_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8816fprop_optimized_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8816fprop_optimized_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8816fprop_optimized_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8832fprop_optimized_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8832fprop_optimized_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8832fprop_optimized_u4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8832fprop_optimized_u4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688dgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688dgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_few_channels_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_few_channels_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_fixed_channels_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_fixed_channels_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688wgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688wgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s4_i8832fprop_optimized_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s4_i8832fprop_optimized_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_few_channels_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_few_channels_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_fixed_channels_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_fixed_channels_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_optimized_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_optimized_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u4_i8832fprop_optimized_u4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u4_i8832fprop_optimized_u4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_few_channels_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_few_channels_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_fixed_channels_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_fixed_channels_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_optimized_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_optimized_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816dgrad_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816dgrad_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816fprop_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816fprop_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816wgrad_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816wgrad_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816dgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816dgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816fprop_fixed_channels_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816fprop_fixed_channels_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816fprop_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816fprop_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816wgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816wgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816dgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816dgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816fprop_fixed_channels.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816fprop_fixed_channels.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816fprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816fprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816wgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816wgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16832fprop_optimized_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16832fprop_optimized_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16832fprop_optimized_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16832fprop_optimized_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16864fprop_optimized_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16864fprop_optimized_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16864fprop_optimized_u4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16864fprop_optimized_u4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816dgrad_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816dgrad_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816dgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816dgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_fixed_channels_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_fixed_channels_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_fixed_channels_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_fixed_channels_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816wgrad_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816wgrad_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816wgrad_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816wgrad_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16dgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16dgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16fprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16fprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16wgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16wgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688dgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688dgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688dgrad_optimized_tf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688dgrad_optimized_tf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16dgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16dgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16fprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16fprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16wgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16wgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688fprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688fprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688fprop_optimized_tf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688fprop_optimized_tf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32dgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32dgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32fprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32fprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32wgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32wgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688wgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688wgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688wgrad_optimized_tf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688wgrad_optimized_tf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s4_i16864fprop_optimized_s4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s4_i16864fprop_optimized_s4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_few_channels_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_few_channels_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_fixed_channels_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_fixed_channels_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_optimized_s8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_optimized_s8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_sdgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_sdgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_sfprop_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_sfprop_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_swgrad_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_swgrad_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688dgrad_optimized_tf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688dgrad_optimized_tf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688fprop_optimized_tf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688fprop_optimized_tf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688wgrad_optimized_tf32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688wgrad_optimized_tf32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u4_i16864fprop_optimized_u4.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u4_i16864fprop_optimized_u4.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_few_channels_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_few_channels_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_fixed_channels_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_fixed_channels_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_optimized_u8.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_optimized_u8.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_optimized_e4m3.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_optimized_e4m3.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_optimized_e5m2.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_optimized_e5m2.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816dgrad3d_analytic_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816dgrad3d_analytic_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816dgrad3d_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816dgrad3d_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816fprop3d_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816fprop3d_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816wgrad3d_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816wgrad3d_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816dgrad3d_analytic.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816dgrad3d_analytic.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816dgrad3d_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816dgrad3d_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816fprop3d_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816fprop3d_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816wgrad3d_optimized.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816wgrad3d_optimized.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_analytic_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_analytic_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_analytic_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_analytic_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816fprop3d_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816fprop3d_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816fprop3d_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816fprop3d_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816wgrad3d_optimized_bf16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816wgrad3d_optimized_bf16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816wgrad3d_optimized_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816wgrad3d_optimized_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688herk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688herk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688tf32herk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688tf32herk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688tf32syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688tf32syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_d884syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_d884syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_gz884herk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_gz884herk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_gz884syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_gz884syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_s1688syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_s1688syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_s1688tf32syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_s1688tf32syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_z884herk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_z884herk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_z884syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_z884syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_d1684syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_d1684syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_gz1684herk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_gz1684herk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_gz1684syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_gz1684syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_z1684herk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_z1684herk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_z1684syrk.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_z1684syrk.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688her2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688her2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688tf32her2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688tf32her2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688tf32syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688tf32syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_d884syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_d884syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_gz884her2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_gz884her2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_gz884syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_gz884syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_s1688syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_s1688syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_s1688tf32syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_s1688tf32syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_z884her2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_z884her2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_z884syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_z884syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_d1684syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_d1684syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_gz1684her2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_gz1684her2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_gz1684syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_gz1684syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_z1684her2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_z1684her2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_z1684syr2k.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_z1684syr2k.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_c1688tf32trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_c1688tf32trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_c1688trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_c1688trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_d884trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_d884trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_gz884trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_gz884trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_s1688tf32trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_s1688tf32trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_s1688trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_s1688trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_z884trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_z884trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_d1684trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_d1684trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_gz1684trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_gz1684trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_z1684trmm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_z1684trmm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688hemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688hemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688tf32hemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688tf32hemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688tf32symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688tf32symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_d884symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_d884symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_gz884hemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_gz884hemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_gz884symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_gz884symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_s1688symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_s1688symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_s1688tf32symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_s1688tf32symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_z884hemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_z884hemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_z884symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_z884symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_d1684symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_d1684symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_gz1684hemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_gz1684hemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_gz1684symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_gz1684symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_z1684hemm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_z1684hemm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_z1684symm.so -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_z1684symm.a -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/share/info/cutlass/generated_kernels.txt -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/bin/cutlass_profiler -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/test/cutlass/ctest/ctest_profiler/CTestTestfile.ctest_profiler.cmake -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/test/cutlass/CTestTestfile.cmake -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/cmake/NvidiaCutlass/NvidiaCutlassConfig.cmake -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/cmake/NvidiaCutlass/NvidiaCutlassConfigVersion.cmake -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/cmake/NvidiaCutlass/NvidiaCutlassTargets.cmake -- Installing: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/cmake/NvidiaCutlass/NvidiaCutlassTargets-release.cmake ~/build/BUILD/cutlass-3.5.0-build/cutlass + popd + rm -rf /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/test + rm -rf /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/share/info + set +x Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/bin/cutlass_profiler Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cdgrad_optimized_cf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cfprop_optimized_cf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cwgrad_optimized_cf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_sdgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_sfprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_swgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm60_hfprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884dgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884fprop_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884wgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884dgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884fprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884wgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884dgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884fprop_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884wgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cdgrad_optimized_cf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cfprop_optimized_cf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cwgrad_optimized_cf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688dgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_few_channels_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_fixed_channels_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688wgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688dgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_few_channels.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_fixed_channels.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688wgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8816fprop_optimized_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8816fprop_optimized_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8832fprop_optimized_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8832fprop_optimized_u4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688dgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_few_channels_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_fixed_channels_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688wgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s4_i8832fprop_optimized_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_few_channels_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_fixed_channels_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_optimized_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u4_i8832fprop_optimized_u4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_few_channels_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_fixed_channels_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_optimized_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816dgrad_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816fprop_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816wgrad_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816dgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816fprop_fixed_channels_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816fprop_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816wgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816dgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816fprop_fixed_channels.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816fprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816wgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16832fprop_optimized_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16832fprop_optimized_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16864fprop_optimized_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16864fprop_optimized_u4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816dgrad_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816dgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_fixed_channels_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_fixed_channels_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816wgrad_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816wgrad_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16dgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16fprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16wgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688dgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688dgrad_optimized_tf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16dgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16fprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16wgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688fprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688fprop_optimized_tf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32dgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32fprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32wgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688wgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688wgrad_optimized_tf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s4_i16864fprop_optimized_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_few_channels_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_fixed_channels_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_optimized_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_sdgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_sfprop_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_swgrad_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688dgrad_optimized_tf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688fprop_optimized_tf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688wgrad_optimized_tf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u4_i16864fprop_optimized_u4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_few_channels_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_fixed_channels_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_optimized_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_optimized_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_optimized_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816dgrad3d_analytic_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816dgrad3d_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816fprop3d_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816wgrad3d_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816dgrad3d_analytic.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816dgrad3d_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816fprop3d_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816wgrad3d_optimized.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_analytic_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_analytic_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816fprop3d_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816fprop3d_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816wgrad3d_optimized_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816wgrad3d_optimized_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_cgemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_dgemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_sgemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm60_hgemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm61_igemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm61_s8_igemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_planar_complex_array_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_planar_complex_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm_planar_complex.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm_planar_complex_array.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_planar_complex_array_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_planar_complex_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_array_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm_planar_complex.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm_planar_complex_array.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i88128xorgemm_b1.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8816gemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8816gemm_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8832gemm_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8832gemm_u4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_planar_complex_array_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_planar_complex_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s4_i8832gemm_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s8_i8816gemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_u4_i8832gemm_u4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_u8_i8816gemm_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_s8_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_u8_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16832spgemm_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_c1688gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_c1688tf32gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_cgemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_d884gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_dgemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_array_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_s8_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_u8_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16832spgemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_gz884gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_grouped.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_planar_complex.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_planar_complex_array.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_s8_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16832spgemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168128spgemm_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168256andgemm_b1.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168256xorgemm_b1.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16832gemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16832gemm_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864gemm_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864gemm_u4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864spgemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_grouped_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_grouped_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_array_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_array_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_s8_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_s8_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_u8_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_u8_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816tf32spgemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16832spgemm_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16832spgemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688bf16gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688f16gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688gemm_tf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688tf32gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s4_i168128spgemm_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s4_i16864gemm_s4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s8_i16832gemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s8_i16864spgemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_sgemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_tf32_s1688gemm_tf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_u4_i16864gemm_u4.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_u8_i16832gemm_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_z884gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e4m3_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e5m2_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e4m3_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e5m2_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x16gemm_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_d1684gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x16gemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_gz1684gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_h64x128x16gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_i64x128x32gemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_i64x128x32gemm_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x16gemm_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x16gemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e4m3_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e5m2_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x8gemm_tf32.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x8tf32gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s8_i64x128x32gemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s8_i64x128x32gemm_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_i64x128x32gemm_s8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_i64x128x32gemm_u8.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x16gemm_bf16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x16gemm_f16.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_z1684gemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688her2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688tf32her2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688tf32syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_d884syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_gz884her2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_gz884syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_s1688syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_s1688tf32syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_z884her2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_z884syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_d1684syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_gz1684her2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_gz1684syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_z1684her2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_z1684syr2k.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688herk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688tf32herk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688tf32syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_d884syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_gz884herk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_gz884syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_s1688syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_s1688tf32syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_z884herk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_z884syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_d1684syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_gz1684herk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_gz1684syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_z1684herk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_z1684syrk.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688hemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688tf32hemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688tf32symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_d884symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_gz884hemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_gz884symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_s1688symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_s1688tf32symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_z884hemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_z884symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_d1684symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_gz1684hemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_gz1684symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_z1684hemm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_z1684symm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_c1688tf32trmm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_c1688trmm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_d884trmm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_gz884trmm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_s1688tf32trmm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_s1688trmm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_z884trmm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_d1684trmm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_gz1684trmm.so Stripping: /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_z1684trmm.so + /usr/lib/rpm/check-buildroot + /usr/lib/rpm/redhat/brp-ldconfig + /usr/lib/rpm/brp-compress + /usr/lib/rpm/brp-strip /usr/bin/strip + /usr/lib/rpm/brp-strip-comment-note /usr/bin/strip /usr/bin/objdump + /usr/lib/rpm/redhat/brp-strip-lto /usr/bin/strip + /usr/lib/rpm/brp-strip-static-archive /usr/bin/strip + /usr/lib/rpm/check-rpaths + /usr/lib/rpm/redhat/brp-mangle-shebangs + /usr/lib/rpm/brp-remove-la-files + env /usr/lib/rpm/redhat/brp-python-bytecompile '' 1 0 -j4 + /usr/lib/rpm/redhat/brp-python-hardlink + /usr/bin/add-determinism --brp -j4 /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_gz1684hemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_z1684hemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_gz1684symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_z1684symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_z884symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_z884hemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm90_d1684symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_gz884symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_gz884hemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_s1688tf32symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_s1688symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688tf32symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_d884symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688tf32hemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688hemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_symm_sm80_c1688symm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_d1684trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_gz1684trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm90_z1684trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_z884trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_s1688trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_d884trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_s1688tf32trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_gz884trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_z1684syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_gz1684syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_z1684her2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_gz1684her2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_c1688tf32trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_trmm_sm80_c1688trmm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_z884syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm90_d1684syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_z884her2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_gz884syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_gz884her2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_d884syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_s1688tf32syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_s1688syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688tf32syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_z1684syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688syr2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688tf32her2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_2k_sm80_c1688her2k.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_z1684herk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_gz1684syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_gz1684herk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_z884syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm90_d1684syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_z884herk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_gz884syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_gz884herk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_s1688tf32syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_d884syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688tf32syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688tf32herk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_s1688syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688syrk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_rank_k_sm80_c1688herk.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816wgrad3d_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816wgrad3d_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816fprop3d_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816fprop3d_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816wgrad3d_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_analytic_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_s16816dgrad3d_analytic_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816fprop3d_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816dgrad3d_analytic.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_h16816dgrad3d_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816wgrad3d_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816dgrad3d_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816fprop3d_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_f16_s16816dgrad3d_analytic_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_optimized_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_fixed_channels_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_few_channels_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm89_s16832fprop_optimized_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u8_i16832fprop_optimized_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688wgrad_optimized_tf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_u4_i16864fprop_optimized_u4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_swgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688dgrad_optimized_tf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_sfprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_fixed_channels_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_tf32_s1688fprop_optimized_tf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_sdgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_optimized_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s8_i16832fprop_few_channels_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688wgrad_optimized_tf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688wgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32wgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s4_i16864fprop_optimized_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32fprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688tf32dgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688fprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16wgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16fprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688f16dgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688dgrad_optimized_tf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688dgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16wgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16fprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816wgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816wgrad_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688bf16dgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_fixed_channels_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816fprop_fixed_channels_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816dgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16864fprop_optimized_u4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s16816dgrad_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16864fprop_optimized_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816wgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16832fprop_optimized_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_i16832fprop_optimized_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816fprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816fprop_fixed_channels.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816wgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_h16816dgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816fprop_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816fprop_fixed_channels_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_f16_s16816dgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816wgrad_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816fprop_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_fixed_channels_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_optimized_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u8_i8816fprop_few_channels_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_bf16_s16816dgrad_optimized_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_u4_i8832fprop_optimized_u4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_fixed_channels_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_few_channels_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s8_i8816fprop_optimized_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688wgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s4_i8832fprop_optimized_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_fixed_channels_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688fprop_few_channels_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8832fprop_optimized_u4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8832fprop_optimized_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_s1688dgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8816fprop_optimized_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_i8816fprop_optimized_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688wgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_fixed_channels.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688fprop_few_channels.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_h1688dgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688wgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_fixed_channels_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688fprop_few_channels_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_f16_s1688dgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cwgrad_optimized_cf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cfprop_optimized_cf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884wgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884fprop_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm80_s1688fprop_optimized_tf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm75_cf32_cdgrad_optimized_cf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884wgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884fprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884wgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_s884dgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884fprop_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_h884dgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_swgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm60_hfprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_sfprop_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm70_f16_s884dgrad_optimized_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_sdgrad_optimized.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cwgrad_optimized_cf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cfprop_optimized_cf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_conv2d_sm50_cf32_cdgrad_optimized_cf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_i64x128x32gemm_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_z1684gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_i64x128x32gemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x16gemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_void_s64x128x16gemm_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s8_i64x128x32gemm_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s8_i64x128x32gemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x8tf32gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x8gemm_tf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e4m3_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x32gemm_e5m2_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x16gemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_i64x128x32gemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_i64x128x32gemm_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_gz1684gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_h64x128x16gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_s64x128x16gemm_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_d1684gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_f16_s64x128x16gemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e5m2_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16864spgemm_e4m3_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e5m2_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e4m3_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832gemm_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm90_bf16_s64x128x16gemm_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_u8_i16832gemm_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_u4_i16864gemm_u4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s8_i16864spgemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s8_i16832gemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_tf32_s1688gemm_tf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s4_i168128spgemm_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_z884gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s4_i16864gemm_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_sgemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688tf32gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688gemm_tf32.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688f16gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16832spgemm_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16832spgemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s1688bf16gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816tf32spgemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_u8_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_s8_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_s8_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_u8_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_array_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_planar_complex_array_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_grouped_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_grouped_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864spgemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864gemm_u4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_s16816gemm_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16864gemm_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16832gemm_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i16832gemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168256xorgemm_b1.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168256andgemm_b1.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_i168128spgemm_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_s8_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16832spgemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_grouped.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_planar_complex_array.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_h16816gemm_planar_complex.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_u8_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16832spgemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_s8_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_gz884gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_dgemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_array_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_d884gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16832spgemm_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_u8_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_s8_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_c1688tf32gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_c1688gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_cgemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_u8_i8816gemm_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_u4_i8832gemm_u4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_bf16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s8_i8816gemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s4_i8832gemm_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8832gemm_u4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8832gemm_s4.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8816gemm_u8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i8816gemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_planar_complex_array_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_i88128xorgemm_b1.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_s1688gemm_planar_complex_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm_planar_complex_array.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_h1688gemm_planar_complex.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_array_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_planar_complex_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_s884gemm_planar_complex_array_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm_planar_complex_array.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_h884gemm_planar_complex.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm61_igemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm61_s8_igemm_s8.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_planar_complex_array_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm70_f16_s884gemm_planar_complex_f16.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_sgemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_dgemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm60_hgemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass_gemm_sm50_cgemm.a: replacing with normalized version /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/lib64/libcutlass.a: replacing with normalized version Scanned 63 directories and 1407 files, processed 346 inodes, 346 modified (346 replaced + 0 rewritten), 0 unsupported format, 0 errors Reading /builddir/build/BUILD/cutlass-3.5.0-build/SPECPARTS/rpm-debuginfo.specpart Processing files: cutlass-3.5.0-20240411.3.cu12_6.fc42.x86_64 Executing(%doc): /bin/sh -e /var/tmp/rpm-tmp.o18uWx + umask 022 + cd /builddir/build/BUILD/cutlass-3.5.0-build + cd cutlass + DOCDIR=/builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/share/doc/cutlass + export LC_ALL=C.UTF-8 + LC_ALL=C.UTF-8 + export DOCDIR + /usr/bin/mkdir -p /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/share/doc/cutlass + cp -pr /builddir/build/BUILD/cutlass-3.5.0-build/cutlass/README.md /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/share/doc/cutlass + cp -pr /builddir/build/BUILD/cutlass-3.5.0-build/cutlass/docs /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/share/doc/cutlass + RPM_EC=0 ++ jobs -p + exit 0 Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.8iZUUl + umask 022 + cd /builddir/build/BUILD/cutlass-3.5.0-build + cd cutlass + LICENSEDIR=/builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/share/licenses/cutlass + export LC_ALL=C.UTF-8 + LC_ALL=C.UTF-8 + export LICENSEDIR + /usr/bin/mkdir -p /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/share/licenses/cutlass + cp -pr /builddir/build/BUILD/cutlass-3.5.0-build/cutlass/LICENSE.txt /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT/usr/share/licenses/cutlass + RPM_EC=0 ++ jobs -p + exit 0 Provides: cutlass = 3.5.0-20240411.3.cu12_6.fc42 cutlass(x86-64) = 3.5.0-20240411.3.cu12_6.fc42 libcutlass.so()(64bit) libcutlass_conv2d_sm50_cf32_cdgrad_optimized_cf32.so()(64bit) libcutlass_conv2d_sm50_cf32_cfprop_optimized_cf32.so()(64bit) libcutlass_conv2d_sm50_cf32_cwgrad_optimized_cf32.so()(64bit) libcutlass_conv2d_sm50_sdgrad_optimized.so()(64bit) libcutlass_conv2d_sm50_sfprop_optimized.so()(64bit) libcutlass_conv2d_sm50_swgrad_optimized.so()(64bit) libcutlass_conv2d_sm60_hfprop_optimized.so()(64bit) libcutlass_conv2d_sm70_f16_s884dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_f16_s884fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_f16_s884wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_h884dgrad_optimized.so()(64bit) libcutlass_conv2d_sm70_h884fprop_optimized.so()(64bit) libcutlass_conv2d_sm70_h884wgrad_optimized.so()(64bit) libcutlass_conv2d_sm70_s884dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_s884fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_s884wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_cf32_cdgrad_optimized_cf32.so()(64bit) libcutlass_conv2d_sm75_cf32_cfprop_optimized_cf32.so()(64bit) libcutlass_conv2d_sm75_cf32_cwgrad_optimized_cf32.so()(64bit) libcutlass_conv2d_sm75_f16_s1688dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_f16_s1688fprop_few_channels_f16.so()(64bit) libcutlass_conv2d_sm75_f16_s1688fprop_fixed_channels_f16.so()(64bit) libcutlass_conv2d_sm75_f16_s1688fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_f16_s1688wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_h1688dgrad_optimized.so()(64bit) libcutlass_conv2d_sm75_h1688fprop_few_channels.so()(64bit) libcutlass_conv2d_sm75_h1688fprop_fixed_channels.so()(64bit) libcutlass_conv2d_sm75_h1688fprop_optimized.so()(64bit) libcutlass_conv2d_sm75_h1688wgrad_optimized.so()(64bit) libcutlass_conv2d_sm75_i8816fprop_optimized_s8.so()(64bit) libcutlass_conv2d_sm75_i8816fprop_optimized_u8.so()(64bit) libcutlass_conv2d_sm75_i8832fprop_optimized_s4.so()(64bit) libcutlass_conv2d_sm75_i8832fprop_optimized_u4.so()(64bit) libcutlass_conv2d_sm75_s1688dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_s1688fprop_few_channels_f16.so()(64bit) libcutlass_conv2d_sm75_s1688fprop_fixed_channels_f16.so()(64bit) libcutlass_conv2d_sm75_s1688fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_s1688wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_s4_i8832fprop_optimized_s4.so()(64bit) libcutlass_conv2d_sm75_s8_i8816fprop_few_channels_s8.so()(64bit) libcutlass_conv2d_sm75_s8_i8816fprop_fixed_channels_s8.so()(64bit) libcutlass_conv2d_sm75_s8_i8816fprop_optimized_s8.so()(64bit) libcutlass_conv2d_sm75_u4_i8832fprop_optimized_u4.so()(64bit) libcutlass_conv2d_sm75_u8_i8816fprop_few_channels_u8.so()(64bit) libcutlass_conv2d_sm75_u8_i8816fprop_fixed_channels_u8.so()(64bit) libcutlass_conv2d_sm75_u8_i8816fprop_optimized_u8.so()(64bit) libcutlass_conv2d_sm80_bf16_s16816dgrad_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16.so()(64bit) libcutlass_conv2d_sm80_bf16_s16816fprop_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_bf16_s16816wgrad_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_f16_s16816dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_f16_s16816fprop_fixed_channels_f16.so()(64bit) libcutlass_conv2d_sm80_f16_s16816fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_f16_s16816wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_h16816dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_h16816fprop_fixed_channels.so()(64bit) libcutlass_conv2d_sm80_h16816fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_h16816wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_i16832fprop_optimized_s8.so()(64bit) libcutlass_conv2d_sm80_i16832fprop_optimized_u8.so()(64bit) libcutlass_conv2d_sm80_i16864fprop_optimized_s4.so()(64bit) libcutlass_conv2d_sm80_i16864fprop_optimized_u4.so()(64bit) libcutlass_conv2d_sm80_s16816dgrad_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_s16816dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_s16816fprop_fixed_channels_bf16.so()(64bit) libcutlass_conv2d_sm80_s16816fprop_fixed_channels_f16.so()(64bit) libcutlass_conv2d_sm80_s16816fprop_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_s16816fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_s16816wgrad_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_s16816wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_s1688bf16dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688bf16fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688bf16wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688dgrad_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_s1688f16dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688f16fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688f16wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688fprop_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_s1688tf32dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688tf32fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688tf32wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688wgrad_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_s4_i16864fprop_optimized_s4.so()(64bit) libcutlass_conv2d_sm80_s8_i16832fprop_few_channels_s8.so()(64bit) libcutlass_conv2d_sm80_s8_i16832fprop_fixed_channels_s8.so()(64bit) libcutlass_conv2d_sm80_s8_i16832fprop_optimized_s8.so()(64bit) libcutlass_conv2d_sm80_sdgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_sfprop_optimized.so()(64bit) libcutlass_conv2d_sm80_swgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_tf32_s1688dgrad_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_tf32_s1688fprop_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_tf32_s1688wgrad_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_u4_i16864fprop_optimized_u4.so()(64bit) libcutlass_conv2d_sm80_u8_i16832fprop_few_channels_u8.so()(64bit) libcutlass_conv2d_sm80_u8_i16832fprop_fixed_channels_u8.so()(64bit) libcutlass_conv2d_sm80_u8_i16832fprop_optimized_u8.so()(64bit) libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e4m3.so()(64bit) libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e5m2.so()(64bit) libcutlass_conv2d_sm89_s16832fprop_optimized_e4m3.so()(64bit) libcutlass_conv2d_sm89_s16832fprop_optimized_e5m2.so()(64bit) libcutlass_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16.so()(64bit) libcutlass_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16.so()(64bit) libcutlass_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32.so()(64bit) libcutlass_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32.so()(64bit) libcutlass_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32.so()(64bit) libcutlass_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32.so()(64bit) libcutlass_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16.so()(64bit) libcutlass_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_f16_s16816dgrad3d_analytic_f16.so()(64bit) libcutlass_conv3d_sm80_f16_s16816dgrad3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_f16_s16816fprop3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_f16_s16816wgrad3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_h16816dgrad3d_analytic.so()(64bit) libcutlass_conv3d_sm80_h16816dgrad3d_optimized.so()(64bit) libcutlass_conv3d_sm80_h16816fprop3d_optimized.so()(64bit) libcutlass_conv3d_sm80_h16816wgrad3d_optimized.so()(64bit) libcutlass_conv3d_sm80_s16816dgrad3d_analytic_bf16.so()(64bit) libcutlass_conv3d_sm80_s16816dgrad3d_analytic_f16.so()(64bit) libcutlass_conv3d_sm80_s16816dgrad3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_s16816dgrad3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_s16816fprop3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_s16816fprop3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_s16816wgrad3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_s16816wgrad3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16.so()(64bit) libcutlass_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16.so()(64bit) libcutlass_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32.so()(64bit) libcutlass_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32.so()(64bit) libcutlass_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32.so()(64bit) libcutlass_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32.so()(64bit) libcutlass_gemm_sm50_cgemm.so()(64bit) libcutlass_gemm_sm50_dgemm.so()(64bit) libcutlass_gemm_sm50_sgemm.so()(64bit) libcutlass_gemm_sm60_hgemm.so()(64bit) libcutlass_gemm_sm61_igemm_s8.so()(64bit) libcutlass_gemm_sm61_s8_igemm_s8.so()(64bit) libcutlass_gemm_sm70_f16_s884gemm_f16.so()(64bit) libcutlass_gemm_sm70_f16_s884gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm70_f16_s884gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm70_h884gemm.so()(64bit) libcutlass_gemm_sm70_h884gemm_planar_complex.so()(64bit) libcutlass_gemm_sm70_h884gemm_planar_complex_array.so()(64bit) libcutlass_gemm_sm70_s884gemm_f16.so()(64bit) libcutlass_gemm_sm70_s884gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm70_s884gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm75_f16_s1688gemm_f16.so()(64bit) libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm75_h1688gemm.so()(64bit) libcutlass_gemm_sm75_h1688gemm_planar_complex.so()(64bit) libcutlass_gemm_sm75_h1688gemm_planar_complex_array.so()(64bit) libcutlass_gemm_sm75_i88128xorgemm_b1.so()(64bit) libcutlass_gemm_sm75_i8816gemm_s8.so()(64bit) libcutlass_gemm_sm75_i8816gemm_u8.so()(64bit) libcutlass_gemm_sm75_i8832gemm_s4.so()(64bit) libcutlass_gemm_sm75_i8832gemm_u4.so()(64bit) libcutlass_gemm_sm75_s1688gemm_f16.so()(64bit) libcutlass_gemm_sm75_s1688gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm75_s1688gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm75_s4_i8832gemm_s4.so()(64bit) libcutlass_gemm_sm75_s8_i8816gemm_s8.so()(64bit) libcutlass_gemm_sm75_u4_i8832gemm_u4.so()(64bit) libcutlass_gemm_sm75_u8_i8816gemm_u8.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_bf16_s8.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_bf16_u8.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_s8_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_u8_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16832spgemm_bf16.so()(64bit) libcutlass_gemm_sm80_c1688gemm.so()(64bit) libcutlass_gemm_sm80_c1688tf32gemm.so()(64bit) libcutlass_gemm_sm80_cgemm.so()(64bit) libcutlass_gemm_sm80_d884gemm.so()(64bit) libcutlass_gemm_sm80_dgemm.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_f16_s8.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_f16_u8.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_s8_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_u8_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16832spgemm_f16.so()(64bit) libcutlass_gemm_sm80_gz884gemm.so()(64bit) libcutlass_gemm_sm80_h16816gemm.so()(64bit) libcutlass_gemm_sm80_h16816gemm_grouped.so()(64bit) libcutlass_gemm_sm80_h16816gemm_planar_complex.so()(64bit) libcutlass_gemm_sm80_h16816gemm_planar_complex_array.so()(64bit) libcutlass_gemm_sm80_h16816gemm_s8_f16.so()(64bit) libcutlass_gemm_sm80_h16832spgemm.so()(64bit) libcutlass_gemm_sm80_i168128spgemm_s4.so()(64bit) libcutlass_gemm_sm80_i168256andgemm_b1.so()(64bit) libcutlass_gemm_sm80_i168256xorgemm_b1.so()(64bit) libcutlass_gemm_sm80_i16832gemm_s8.so()(64bit) libcutlass_gemm_sm80_i16832gemm_u8.so()(64bit) libcutlass_gemm_sm80_i16864gemm_s4.so()(64bit) libcutlass_gemm_sm80_i16864gemm_u4.so()(64bit) libcutlass_gemm_sm80_i16864spgemm_s8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_bf16_s8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_bf16_u8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_f16_s8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_f16_u8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_grouped_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_grouped_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_planar_complex_array_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_planar_complex_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_s8_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_s8_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_u8_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_u8_f16.so()(64bit) libcutlass_gemm_sm80_s16816tf32spgemm.so()(64bit) libcutlass_gemm_sm80_s16832spgemm_bf16.so()(64bit) libcutlass_gemm_sm80_s16832spgemm_f16.so()(64bit) libcutlass_gemm_sm80_s1688bf16gemm.so()(64bit) libcutlass_gemm_sm80_s1688f16gemm.so()(64bit) libcutlass_gemm_sm80_s1688gemm.so()(64bit) libcutlass_gemm_sm80_s1688gemm_tf32.so()(64bit) libcutlass_gemm_sm80_s1688tf32gemm.so()(64bit) libcutlass_gemm_sm80_s4_i168128spgemm_s4.so()(64bit) libcutlass_gemm_sm80_s4_i16864gemm_s4.so()(64bit) libcutlass_gemm_sm80_s8_i16832gemm_s8.so()(64bit) libcutlass_gemm_sm80_s8_i16864spgemm_s8.so()(64bit) libcutlass_gemm_sm80_sgemm.so()(64bit) libcutlass_gemm_sm80_tf32_s1688gemm_tf32.so()(64bit) libcutlass_gemm_sm80_u4_i16864gemm_u4.so()(64bit) libcutlass_gemm_sm80_u8_i16832gemm_u8.so()(64bit) libcutlass_gemm_sm80_z884gemm.so()(64bit) libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3.so()(64bit) libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2.so()(64bit) libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm89_s16832gemm_e4m3.so()(64bit) libcutlass_gemm_sm89_s16832gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm89_s16832gemm_e5m2.so()(64bit) libcutlass_gemm_sm89_s16832gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3.so()(64bit) libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2.so()(64bit) libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm89_s16864spgemm_e4m3.so()(64bit) libcutlass_gemm_sm89_s16864spgemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm89_s16864spgemm_e5m2.so()(64bit) libcutlass_gemm_sm89_s16864spgemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x16gemm_bf16.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_d1684gemm.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x16gemm_f16.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_gz1684gemm.so()(64bit) libcutlass_gemm_sm90_h64x128x16gemm.so()(64bit) libcutlass_gemm_sm90_i64x128x32gemm_s8.so()(64bit) libcutlass_gemm_sm90_i64x128x32gemm_u8.so()(64bit) libcutlass_gemm_sm90_s64x128x16gemm_bf16.so()(64bit) libcutlass_gemm_sm90_s64x128x16gemm_f16.so()(64bit) libcutlass_gemm_sm90_s64x128x32gemm_e4m3.so()(64bit) libcutlass_gemm_sm90_s64x128x32gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm90_s64x128x32gemm_e5m2.so()(64bit) libcutlass_gemm_sm90_s64x128x32gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_s64x128x8gemm_tf32.so()(64bit) libcutlass_gemm_sm90_s64x128x8tf32gemm.so()(64bit) libcutlass_gemm_sm90_s8_i64x128x32gemm_s8.so()(64bit) libcutlass_gemm_sm90_s8_i64x128x32gemm_u8.so()(64bit) libcutlass_gemm_sm90_void_i64x128x32gemm_s8.so()(64bit) libcutlass_gemm_sm90_void_i64x128x32gemm_u8.so()(64bit) libcutlass_gemm_sm90_void_s64x128x16gemm_bf16.so()(64bit) libcutlass_gemm_sm90_void_s64x128x16gemm_f16.so()(64bit) libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3.so()(64bit) libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2.so()(64bit) libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_z1684gemm.so()(64bit) libcutlass_rank_2k_sm80_c1688her2k.so()(64bit) libcutlass_rank_2k_sm80_c1688syr2k.so()(64bit) libcutlass_rank_2k_sm80_c1688tf32her2k.so()(64bit) libcutlass_rank_2k_sm80_c1688tf32syr2k.so()(64bit) libcutlass_rank_2k_sm80_d884syr2k.so()(64bit) libcutlass_rank_2k_sm80_gz884her2k.so()(64bit) libcutlass_rank_2k_sm80_gz884syr2k.so()(64bit) libcutlass_rank_2k_sm80_s1688syr2k.so()(64bit) libcutlass_rank_2k_sm80_s1688tf32syr2k.so()(64bit) libcutlass_rank_2k_sm80_z884her2k.so()(64bit) libcutlass_rank_2k_sm80_z884syr2k.so()(64bit) libcutlass_rank_2k_sm90_d1684syr2k.so()(64bit) libcutlass_rank_2k_sm90_gz1684her2k.so()(64bit) libcutlass_rank_2k_sm90_gz1684syr2k.so()(64bit) libcutlass_rank_2k_sm90_z1684her2k.so()(64bit) libcutlass_rank_2k_sm90_z1684syr2k.so()(64bit) libcutlass_rank_k_sm80_c1688herk.so()(64bit) libcutlass_rank_k_sm80_c1688syrk.so()(64bit) libcutlass_rank_k_sm80_c1688tf32herk.so()(64bit) libcutlass_rank_k_sm80_c1688tf32syrk.so()(64bit) libcutlass_rank_k_sm80_d884syrk.so()(64bit) libcutlass_rank_k_sm80_gz884herk.so()(64bit) libcutlass_rank_k_sm80_gz884syrk.so()(64bit) libcutlass_rank_k_sm80_s1688syrk.so()(64bit) libcutlass_rank_k_sm80_s1688tf32syrk.so()(64bit) libcutlass_rank_k_sm80_z884herk.so()(64bit) libcutlass_rank_k_sm80_z884syrk.so()(64bit) libcutlass_rank_k_sm90_d1684syrk.so()(64bit) libcutlass_rank_k_sm90_gz1684herk.so()(64bit) libcutlass_rank_k_sm90_gz1684syrk.so()(64bit) libcutlass_rank_k_sm90_z1684herk.so()(64bit) libcutlass_rank_k_sm90_z1684syrk.so()(64bit) libcutlass_symm_sm80_c1688hemm.so()(64bit) libcutlass_symm_sm80_c1688symm.so()(64bit) libcutlass_symm_sm80_c1688tf32hemm.so()(64bit) libcutlass_symm_sm80_c1688tf32symm.so()(64bit) libcutlass_symm_sm80_d884symm.so()(64bit) libcutlass_symm_sm80_gz884hemm.so()(64bit) libcutlass_symm_sm80_gz884symm.so()(64bit) libcutlass_symm_sm80_s1688symm.so()(64bit) libcutlass_symm_sm80_s1688tf32symm.so()(64bit) libcutlass_symm_sm80_z884hemm.so()(64bit) libcutlass_symm_sm80_z884symm.so()(64bit) libcutlass_symm_sm90_d1684symm.so()(64bit) libcutlass_symm_sm90_gz1684hemm.so()(64bit) libcutlass_symm_sm90_gz1684symm.so()(64bit) libcutlass_symm_sm90_z1684hemm.so()(64bit) libcutlass_symm_sm90_z1684symm.so()(64bit) libcutlass_trmm_sm80_c1688tf32trmm.so()(64bit) libcutlass_trmm_sm80_c1688trmm.so()(64bit) libcutlass_trmm_sm80_d884trmm.so()(64bit) libcutlass_trmm_sm80_gz884trmm.so()(64bit) libcutlass_trmm_sm80_s1688tf32trmm.so()(64bit) libcutlass_trmm_sm80_s1688trmm.so()(64bit) libcutlass_trmm_sm80_z884trmm.so()(64bit) libcutlass_trmm_sm90_d1684trmm.so()(64bit) libcutlass_trmm_sm90_gz1684trmm.so()(64bit) libcutlass_trmm_sm90_z1684trmm.so()(64bit) Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Requires: ld-linux-x86-64.so.2()(64bit) ld-linux-x86-64.so.2(GLIBC_2.3)(64bit) libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_2.34)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libcuda.so.1()(64bit) libcudart.so.12()(64bit) libcudart.so.12(libcudart.so.12)(64bit) libcutlass.so()(64bit) libcutlass_conv2d_sm50_cf32_cdgrad_optimized_cf32.so()(64bit) libcutlass_conv2d_sm50_cf32_cfprop_optimized_cf32.so()(64bit) libcutlass_conv2d_sm50_cf32_cwgrad_optimized_cf32.so()(64bit) libcutlass_conv2d_sm50_sdgrad_optimized.so()(64bit) libcutlass_conv2d_sm50_sfprop_optimized.so()(64bit) libcutlass_conv2d_sm50_swgrad_optimized.so()(64bit) libcutlass_conv2d_sm60_hfprop_optimized.so()(64bit) libcutlass_conv2d_sm70_f16_s884dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_f16_s884fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_f16_s884wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_h884dgrad_optimized.so()(64bit) libcutlass_conv2d_sm70_h884fprop_optimized.so()(64bit) libcutlass_conv2d_sm70_h884wgrad_optimized.so()(64bit) libcutlass_conv2d_sm70_s884dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_s884fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm70_s884wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_cf32_cdgrad_optimized_cf32.so()(64bit) libcutlass_conv2d_sm75_cf32_cfprop_optimized_cf32.so()(64bit) libcutlass_conv2d_sm75_cf32_cwgrad_optimized_cf32.so()(64bit) libcutlass_conv2d_sm75_f16_s1688dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_f16_s1688fprop_few_channels_f16.so()(64bit) libcutlass_conv2d_sm75_f16_s1688fprop_fixed_channels_f16.so()(64bit) libcutlass_conv2d_sm75_f16_s1688fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_f16_s1688wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_h1688dgrad_optimized.so()(64bit) libcutlass_conv2d_sm75_h1688fprop_few_channels.so()(64bit) libcutlass_conv2d_sm75_h1688fprop_fixed_channels.so()(64bit) libcutlass_conv2d_sm75_h1688fprop_optimized.so()(64bit) libcutlass_conv2d_sm75_h1688wgrad_optimized.so()(64bit) libcutlass_conv2d_sm75_i8816fprop_optimized_s8.so()(64bit) libcutlass_conv2d_sm75_i8816fprop_optimized_u8.so()(64bit) libcutlass_conv2d_sm75_i8832fprop_optimized_s4.so()(64bit) libcutlass_conv2d_sm75_i8832fprop_optimized_u4.so()(64bit) libcutlass_conv2d_sm75_s1688dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_s1688fprop_few_channels_f16.so()(64bit) libcutlass_conv2d_sm75_s1688fprop_fixed_channels_f16.so()(64bit) libcutlass_conv2d_sm75_s1688fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_s1688wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm75_s4_i8832fprop_optimized_s4.so()(64bit) libcutlass_conv2d_sm75_s8_i8816fprop_few_channels_s8.so()(64bit) libcutlass_conv2d_sm75_s8_i8816fprop_fixed_channels_s8.so()(64bit) libcutlass_conv2d_sm75_s8_i8816fprop_optimized_s8.so()(64bit) libcutlass_conv2d_sm75_u4_i8832fprop_optimized_u4.so()(64bit) libcutlass_conv2d_sm75_u8_i8816fprop_few_channels_u8.so()(64bit) libcutlass_conv2d_sm75_u8_i8816fprop_fixed_channels_u8.so()(64bit) libcutlass_conv2d_sm75_u8_i8816fprop_optimized_u8.so()(64bit) libcutlass_conv2d_sm80_bf16_s16816dgrad_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_bf16_s16816fprop_fixed_channels_bf16.so()(64bit) libcutlass_conv2d_sm80_bf16_s16816fprop_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_bf16_s16816wgrad_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_f16_s16816dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_f16_s16816fprop_fixed_channels_f16.so()(64bit) libcutlass_conv2d_sm80_f16_s16816fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_f16_s16816wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_h16816dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_h16816fprop_fixed_channels.so()(64bit) libcutlass_conv2d_sm80_h16816fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_h16816wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_i16832fprop_optimized_s8.so()(64bit) libcutlass_conv2d_sm80_i16832fprop_optimized_u8.so()(64bit) libcutlass_conv2d_sm80_i16864fprop_optimized_s4.so()(64bit) libcutlass_conv2d_sm80_i16864fprop_optimized_u4.so()(64bit) libcutlass_conv2d_sm80_s16816dgrad_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_s16816dgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_s16816fprop_fixed_channels_bf16.so()(64bit) libcutlass_conv2d_sm80_s16816fprop_fixed_channels_f16.so()(64bit) libcutlass_conv2d_sm80_s16816fprop_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_s16816fprop_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_s16816wgrad_optimized_bf16.so()(64bit) libcutlass_conv2d_sm80_s16816wgrad_optimized_f16.so()(64bit) libcutlass_conv2d_sm80_s1688bf16dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688bf16fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688bf16wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688dgrad_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_s1688f16dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688f16fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688f16wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688fprop_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_s1688tf32dgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688tf32fprop_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688tf32wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688wgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_s1688wgrad_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_s4_i16864fprop_optimized_s4.so()(64bit) libcutlass_conv2d_sm80_s8_i16832fprop_few_channels_s8.so()(64bit) libcutlass_conv2d_sm80_s8_i16832fprop_fixed_channels_s8.so()(64bit) libcutlass_conv2d_sm80_s8_i16832fprop_optimized_s8.so()(64bit) libcutlass_conv2d_sm80_sdgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_sfprop_optimized.so()(64bit) libcutlass_conv2d_sm80_swgrad_optimized.so()(64bit) libcutlass_conv2d_sm80_tf32_s1688dgrad_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_tf32_s1688fprop_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_tf32_s1688wgrad_optimized_tf32.so()(64bit) libcutlass_conv2d_sm80_u4_i16864fprop_optimized_u4.so()(64bit) libcutlass_conv2d_sm80_u8_i16832fprop_few_channels_u8.so()(64bit) libcutlass_conv2d_sm80_u8_i16832fprop_fixed_channels_u8.so()(64bit) libcutlass_conv2d_sm80_u8_i16832fprop_optimized_u8.so()(64bit) libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e4m3.so()(64bit) libcutlass_conv2d_sm89_s16832fprop_fixed_channels_e5m2.so()(64bit) libcutlass_conv2d_sm89_s16832fprop_optimized_e4m3.so()(64bit) libcutlass_conv2d_sm89_s16832fprop_optimized_e5m2.so()(64bit) libcutlass_conv2d_sm90_h64x64x16dgrad_f16nhwc_f16nhwc_f16_f16_f16.so()(64bit) libcutlass_conv2d_sm90_h64x64x16fprop_f16nhwc_f16nhwc_f16_f16_f16.so()(64bit) libcutlass_conv2d_sm90_s64x64x16dgrad_bf16nhwc_bf16nhwc_f32_f32_f32.so()(64bit) libcutlass_conv2d_sm90_s64x64x16dgrad_f16nhwc_f16nhwc_f32_f32_f32.so()(64bit) libcutlass_conv2d_sm90_s64x64x16fprop_bf16nhwc_bf16nhwc_f32_f32_f32.so()(64bit) libcutlass_conv2d_sm90_s64x64x16fprop_f16nhwc_f16nhwc_f32_f32_f32.so()(64bit) libcutlass_conv3d_sm80_bf16_s16816dgrad3d_analytic_bf16.so()(64bit) libcutlass_conv3d_sm80_bf16_s16816dgrad3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_bf16_s16816fprop3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_bf16_s16816wgrad3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_f16_s16816dgrad3d_analytic_f16.so()(64bit) libcutlass_conv3d_sm80_f16_s16816dgrad3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_f16_s16816fprop3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_f16_s16816wgrad3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_h16816dgrad3d_analytic.so()(64bit) libcutlass_conv3d_sm80_h16816dgrad3d_optimized.so()(64bit) libcutlass_conv3d_sm80_h16816fprop3d_optimized.so()(64bit) libcutlass_conv3d_sm80_h16816wgrad3d_optimized.so()(64bit) libcutlass_conv3d_sm80_s16816dgrad3d_analytic_bf16.so()(64bit) libcutlass_conv3d_sm80_s16816dgrad3d_analytic_f16.so()(64bit) libcutlass_conv3d_sm80_s16816dgrad3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_s16816dgrad3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_s16816fprop3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_s16816fprop3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm80_s16816wgrad3d_optimized_bf16.so()(64bit) libcutlass_conv3d_sm80_s16816wgrad3d_optimized_f16.so()(64bit) libcutlass_conv3d_sm90_h64x64x16dgrad_f16ndhwc_f16ndhwc_f16_f16_f16.so()(64bit) libcutlass_conv3d_sm90_h64x64x16fprop_f16ndhwc_f16ndhwc_f16_f16_f16.so()(64bit) libcutlass_conv3d_sm90_s64x64x16dgrad_bf16ndhwc_bf16ndhwc_f32_f32_f32.so()(64bit) libcutlass_conv3d_sm90_s64x64x16dgrad_f16ndhwc_f16ndhwc_f32_f32_f32.so()(64bit) libcutlass_conv3d_sm90_s64x64x16fprop_bf16ndhwc_bf16ndhwc_f32_f32_f32.so()(64bit) libcutlass_conv3d_sm90_s64x64x16fprop_f16ndhwc_f16ndhwc_f32_f32_f32.so()(64bit) libcutlass_gemm_sm50_cgemm.so()(64bit) libcutlass_gemm_sm50_dgemm.so()(64bit) libcutlass_gemm_sm50_sgemm.so()(64bit) libcutlass_gemm_sm60_hgemm.so()(64bit) libcutlass_gemm_sm61_igemm_s8.so()(64bit) libcutlass_gemm_sm61_s8_igemm_s8.so()(64bit) libcutlass_gemm_sm70_f16_s884gemm_f16.so()(64bit) libcutlass_gemm_sm70_f16_s884gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm70_f16_s884gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm70_h884gemm.so()(64bit) libcutlass_gemm_sm70_h884gemm_planar_complex.so()(64bit) libcutlass_gemm_sm70_h884gemm_planar_complex_array.so()(64bit) libcutlass_gemm_sm70_s884gemm_f16.so()(64bit) libcutlass_gemm_sm70_s884gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm70_s884gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm75_f16_s1688gemm_f16.so()(64bit) libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm75_f16_s1688gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm75_h1688gemm.so()(64bit) libcutlass_gemm_sm75_h1688gemm_planar_complex.so()(64bit) libcutlass_gemm_sm75_h1688gemm_planar_complex_array.so()(64bit) libcutlass_gemm_sm75_i88128xorgemm_b1.so()(64bit) libcutlass_gemm_sm75_i8816gemm_s8.so()(64bit) libcutlass_gemm_sm75_i8816gemm_u8.so()(64bit) libcutlass_gemm_sm75_i8832gemm_s4.so()(64bit) libcutlass_gemm_sm75_i8832gemm_u4.so()(64bit) libcutlass_gemm_sm75_s1688gemm_f16.so()(64bit) libcutlass_gemm_sm75_s1688gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm75_s1688gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm75_s4_i8832gemm_s4.so()(64bit) libcutlass_gemm_sm75_s8_i8816gemm_s8.so()(64bit) libcutlass_gemm_sm75_u4_i8832gemm_u4.so()(64bit) libcutlass_gemm_sm75_u8_i8816gemm_u8.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_bf16_s8.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_bf16_u8.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_array_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_planar_complex_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_s8_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16816gemm_u8_bf16.so()(64bit) libcutlass_gemm_sm80_bf16_s16832spgemm_bf16.so()(64bit) libcutlass_gemm_sm80_c1688gemm.so()(64bit) libcutlass_gemm_sm80_c1688tf32gemm.so()(64bit) libcutlass_gemm_sm80_cgemm.so()(64bit) libcutlass_gemm_sm80_d884gemm.so()(64bit) libcutlass_gemm_sm80_dgemm.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_f16_s8.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_f16_u8.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_s8_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16816gemm_u8_f16.so()(64bit) libcutlass_gemm_sm80_f16_s16832spgemm_f16.so()(64bit) libcutlass_gemm_sm80_gz884gemm.so()(64bit) libcutlass_gemm_sm80_h16816gemm.so()(64bit) libcutlass_gemm_sm80_h16816gemm_grouped.so()(64bit) libcutlass_gemm_sm80_h16816gemm_planar_complex.so()(64bit) libcutlass_gemm_sm80_h16816gemm_planar_complex_array.so()(64bit) libcutlass_gemm_sm80_h16816gemm_s8_f16.so()(64bit) libcutlass_gemm_sm80_h16832spgemm.so()(64bit) libcutlass_gemm_sm80_i168128spgemm_s4.so()(64bit) libcutlass_gemm_sm80_i168256andgemm_b1.so()(64bit) libcutlass_gemm_sm80_i168256xorgemm_b1.so()(64bit) libcutlass_gemm_sm80_i16832gemm_s8.so()(64bit) libcutlass_gemm_sm80_i16832gemm_u8.so()(64bit) libcutlass_gemm_sm80_i16864gemm_s4.so()(64bit) libcutlass_gemm_sm80_i16864gemm_u4.so()(64bit) libcutlass_gemm_sm80_i16864spgemm_s8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_bf16_s8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_bf16_u8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_f16_s8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_f16_u8.so()(64bit) libcutlass_gemm_sm80_s16816gemm_grouped_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_grouped_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_planar_complex_array_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_planar_complex_array_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_planar_complex_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_planar_complex_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_s8_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_s8_f16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_u8_bf16.so()(64bit) libcutlass_gemm_sm80_s16816gemm_u8_f16.so()(64bit) libcutlass_gemm_sm80_s16816tf32spgemm.so()(64bit) libcutlass_gemm_sm80_s16832spgemm_bf16.so()(64bit) libcutlass_gemm_sm80_s16832spgemm_f16.so()(64bit) libcutlass_gemm_sm80_s1688bf16gemm.so()(64bit) libcutlass_gemm_sm80_s1688f16gemm.so()(64bit) libcutlass_gemm_sm80_s1688gemm.so()(64bit) libcutlass_gemm_sm80_s1688gemm_tf32.so()(64bit) libcutlass_gemm_sm80_s1688tf32gemm.so()(64bit) libcutlass_gemm_sm80_s4_i168128spgemm_s4.so()(64bit) libcutlass_gemm_sm80_s4_i16864gemm_s4.so()(64bit) libcutlass_gemm_sm80_s8_i16832gemm_s8.so()(64bit) libcutlass_gemm_sm80_s8_i16864spgemm_s8.so()(64bit) libcutlass_gemm_sm80_sgemm.so()(64bit) libcutlass_gemm_sm80_tf32_s1688gemm_tf32.so()(64bit) libcutlass_gemm_sm80_u4_i16864gemm_u4.so()(64bit) libcutlass_gemm_sm80_u8_i16832gemm_u8.so()(64bit) libcutlass_gemm_sm80_z884gemm.so()(64bit) libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3.so()(64bit) libcutlass_gemm_sm89_s16832fastaccumgemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2.so()(64bit) libcutlass_gemm_sm89_s16832fastaccumgemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm89_s16832gemm_e4m3.so()(64bit) libcutlass_gemm_sm89_s16832gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm89_s16832gemm_e5m2.so()(64bit) libcutlass_gemm_sm89_s16832gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3.so()(64bit) libcutlass_gemm_sm89_s16864fastaccumspgemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2.so()(64bit) libcutlass_gemm_sm89_s16864fastaccumspgemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm89_s16864spgemm_e4m3.so()(64bit) libcutlass_gemm_sm89_s16864spgemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm89_s16864spgemm_e5m2.so()(64bit) libcutlass_gemm_sm89_s16864spgemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x16gemm_bf16.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x32gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2.so()(64bit) libcutlass_gemm_sm90_bf16_s64x128x32gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_d1684gemm.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x16gemm_f16.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x32gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2.so()(64bit) libcutlass_gemm_sm90_f16_s64x128x32gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_gz1684gemm.so()(64bit) libcutlass_gemm_sm90_h64x128x16gemm.so()(64bit) libcutlass_gemm_sm90_i64x128x32gemm_s8.so()(64bit) libcutlass_gemm_sm90_i64x128x32gemm_u8.so()(64bit) libcutlass_gemm_sm90_s64x128x16gemm_bf16.so()(64bit) libcutlass_gemm_sm90_s64x128x16gemm_f16.so()(64bit) libcutlass_gemm_sm90_s64x128x32gemm_e4m3.so()(64bit) libcutlass_gemm_sm90_s64x128x32gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm90_s64x128x32gemm_e5m2.so()(64bit) libcutlass_gemm_sm90_s64x128x32gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_s64x128x8gemm_tf32.so()(64bit) libcutlass_gemm_sm90_s64x128x8tf32gemm.so()(64bit) libcutlass_gemm_sm90_s8_i64x128x32gemm_s8.so()(64bit) libcutlass_gemm_sm90_s8_i64x128x32gemm_u8.so()(64bit) libcutlass_gemm_sm90_void_i64x128x32gemm_s8.so()(64bit) libcutlass_gemm_sm90_void_i64x128x32gemm_u8.so()(64bit) libcutlass_gemm_sm90_void_s64x128x16gemm_bf16.so()(64bit) libcutlass_gemm_sm90_void_s64x128x16gemm_f16.so()(64bit) libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3.so()(64bit) libcutlass_gemm_sm90_void_s64x128x32gemm_e4m3_e5m2.so()(64bit) libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2.so()(64bit) libcutlass_gemm_sm90_void_s64x128x32gemm_e5m2_e4m3.so()(64bit) libcutlass_gemm_sm90_z1684gemm.so()(64bit) libcutlass_rank_2k_sm80_c1688her2k.so()(64bit) libcutlass_rank_2k_sm80_c1688syr2k.so()(64bit) libcutlass_rank_2k_sm80_c1688tf32her2k.so()(64bit) libcutlass_rank_2k_sm80_c1688tf32syr2k.so()(64bit) libcutlass_rank_2k_sm80_d884syr2k.so()(64bit) libcutlass_rank_2k_sm80_gz884her2k.so()(64bit) libcutlass_rank_2k_sm80_gz884syr2k.so()(64bit) libcutlass_rank_2k_sm80_s1688syr2k.so()(64bit) libcutlass_rank_2k_sm80_s1688tf32syr2k.so()(64bit) libcutlass_rank_2k_sm80_z884her2k.so()(64bit) libcutlass_rank_2k_sm80_z884syr2k.so()(64bit) libcutlass_rank_2k_sm90_d1684syr2k.so()(64bit) libcutlass_rank_2k_sm90_gz1684her2k.so()(64bit) libcutlass_rank_2k_sm90_gz1684syr2k.so()(64bit) libcutlass_rank_2k_sm90_z1684her2k.so()(64bit) libcutlass_rank_2k_sm90_z1684syr2k.so()(64bit) libcutlass_rank_k_sm80_c1688herk.so()(64bit) libcutlass_rank_k_sm80_c1688syrk.so()(64bit) libcutlass_rank_k_sm80_c1688tf32herk.so()(64bit) libcutlass_rank_k_sm80_c1688tf32syrk.so()(64bit) libcutlass_rank_k_sm80_d884syrk.so()(64bit) libcutlass_rank_k_sm80_gz884herk.so()(64bit) libcutlass_rank_k_sm80_gz884syrk.so()(64bit) libcutlass_rank_k_sm80_s1688syrk.so()(64bit) libcutlass_rank_k_sm80_s1688tf32syrk.so()(64bit) libcutlass_rank_k_sm80_z884herk.so()(64bit) libcutlass_rank_k_sm80_z884syrk.so()(64bit) libcutlass_rank_k_sm90_d1684syrk.so()(64bit) libcutlass_rank_k_sm90_gz1684herk.so()(64bit) libcutlass_rank_k_sm90_gz1684syrk.so()(64bit) libcutlass_rank_k_sm90_z1684herk.so()(64bit) libcutlass_rank_k_sm90_z1684syrk.so()(64bit) libcutlass_symm_sm80_c1688hemm.so()(64bit) libcutlass_symm_sm80_c1688symm.so()(64bit) libcutlass_symm_sm80_c1688tf32hemm.so()(64bit) libcutlass_symm_sm80_c1688tf32symm.so()(64bit) libcutlass_symm_sm80_d884symm.so()(64bit) libcutlass_symm_sm80_gz884hemm.so()(64bit) libcutlass_symm_sm80_gz884symm.so()(64bit) libcutlass_symm_sm80_s1688symm.so()(64bit) libcutlass_symm_sm80_s1688tf32symm.so()(64bit) libcutlass_symm_sm80_z884hemm.so()(64bit) libcutlass_symm_sm80_z884symm.so()(64bit) libcutlass_symm_sm90_d1684symm.so()(64bit) libcutlass_symm_sm90_gz1684hemm.so()(64bit) libcutlass_symm_sm90_gz1684symm.so()(64bit) libcutlass_symm_sm90_z1684hemm.so()(64bit) libcutlass_symm_sm90_z1684symm.so()(64bit) libcutlass_trmm_sm80_c1688tf32trmm.so()(64bit) libcutlass_trmm_sm80_c1688trmm.so()(64bit) libcutlass_trmm_sm80_d884trmm.so()(64bit) libcutlass_trmm_sm80_gz884trmm.so()(64bit) libcutlass_trmm_sm80_s1688tf32trmm.so()(64bit) libcutlass_trmm_sm80_s1688trmm.so()(64bit) libcutlass_trmm_sm80_z884trmm.so()(64bit) libcutlass_trmm_sm90_d1684trmm.so()(64bit) libcutlass_trmm_sm90_gz1684trmm.so()(64bit) libcutlass_trmm_sm90_z1684trmm.so()(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.2.5)(64bit) libm.so.6(GLIBC_2.29)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.5)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.15)(64bit) libstdc++.so.6(GLIBCXX_3.4.18)(64bit) libstdc++.so.6(GLIBCXX_3.4.20)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.26)(64bit) libstdc++.so.6(GLIBCXX_3.4.29)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) libstdc++.so.6(GLIBCXX_3.4.5)(64bit) libstdc++.so.6(GLIBCXX_3.4.9)(64bit) rtld(GNU_HASH) Processing files: cutlass-devel-3.5.0-20240411.3.cu12_6.fc42.x86_64 Provides: cmake(NvidiaCutlass) = 3.5.0 cmake(nvidiacutlass) = 3.5.0 cutlass-devel = 3.5.0-20240411.3.cu12_6.fc42 cutlass-devel(x86-64) = 3.5.0-20240411.3.cu12_6.fc42 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Requires: cmake-filesystem(x86-64) Processing files: cutlass-static-3.5.0-20240411.3.cu12_6.fc42.x86_64 Provides: cutlass-static = 3.5.0-20240411.3.cu12_6.fc42 cutlass-static(x86-64) = 3.5.0-20240411.3.cu12_6.fc42 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Checking for unpackaged file(s): /usr/lib/rpm/check-files /builddir/build/BUILD/cutlass-3.5.0-build/BUILDROOT Wrote: /builddir/build/RPMS/cutlass-devel-3.5.0-20240411.3.cu12_6.fc42.x86_64.rpm Wrote: /builddir/build/RPMS/cutlass-static-3.5.0-20240411.3.cu12_6.fc42.x86_64.rpm Wrote: /builddir/build/RPMS/cutlass-3.5.0-20240411.3.cu12_6.fc42.x86_64.rpm Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.u7RfEA + umask 022 + cd /builddir/build/BUILD/cutlass-3.5.0-build + test -d /builddir/build/BUILD/cutlass-3.5.0-build + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w /builddir/build/BUILD/cutlass-3.5.0-build + rm -rf /builddir/build/BUILD/cutlass-3.5.0-build + RPM_EC=0 ++ jobs -p + exit 0 Finish: rpmbuild cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm Finish: build phase for cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: /var/lib/mock/fedora-rawhide-x86_64-1723840744.707232/root/var/log/dnf5.log INFO: Done(/var/lib/copr-rpmbuild/results/cutlass-3.5.0-20240411.3.cu12_6.fc42.src.rpm) Config(child) 407 minutes 10 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_success=True') Start: clean chroot INFO: unmounting tmpfs. Finish: clean chroot Finish: run Running RPMResults tool Package info: { "packages": [ { "name": "cutlass-devel", "epoch": null, "version": "3.5.0", "release": "20240411.3.cu12_6.fc42", "arch": "x86_64" }, { "name": "cutlass", "epoch": null, "version": "3.5.0", "release": "20240411.3.cu12_6.fc42", "arch": "src" }, { "name": "cutlass", "epoch": null, "version": "3.5.0", "release": "20240411.3.cu12_6.fc42", "arch": "x86_64" }, { "name": "cutlass-static", "epoch": null, "version": "3.5.0", "release": "20240411.3.cu12_6.fc42", "arch": "x86_64" } ] } RPMResults finished